r - R脚本-NLS无法正常工作

原文 标签 r least-squares nls

R script - NLS not working

I have 5 (x,y) data points and I'm trying to find a best fit solution consisting of two lines which intersect at a point (x0,y0), and which follow these equations:

y1 = (m1)(x1 - x0) + y0
y2 = (m2)(x2 - x0) + y0

Specifically, I require that the intersection must occur between x=2 and x=3. Have a look at the code:

#Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)

x2 <- c(3,4,5)
y2 <- c(20,30,40)

g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

q <- nls(c(y1, y2) ~ ifelse(g == TRUE, m1 * (x1 - x0) + y0, m2 * (x2 - x0) + y0), start = c(m1 = -1, m2 = 1, y0 = 0, x0 = 2), algorithm = "port", lower = c(m1 = -Inf, m2 = -Inf, y0 = -Inf, x0 = 2), upper = c(m1 = Inf, m2 = Inf, y0 = Inf, x0 = 3))
coef <- coef(q)
m1 <- coef[1]
m2 <- coef[2]
y0 <- coef[3]
x0 <- coef[4]

#Plot the original x1, y1, and x2, y2
plot(x1,y1,xlim=c(1,5),ylim=c(0,50))
points(x2,y2)

#Plot the fits
x1 <- c(1,2,3,4,5)
fit1 <- m1 * (x1 - x0) + y0
lines(x1, fit1, col="red")

x2   <- c(1,2,3,4,5)
fit2 <- m2 * (x2 - x0) + y0
lines(x2, fit2, col="blue")

So, you can see the data points listed there. Then, I run it through my nls, get my parameters m1, m2, x0, y0 (the slopes, and the intersection point).

But, take a look at the solution: enter image description here

Clearly, the red line (which is supposed to only be based on the first 2 points) is not the best line of fit for the first 2 points. This is the same case with the blue line (the 2nd fit), which supposed to be is dependent on the last 3 points). What is wrong here?

Answer

I'm not exactly sure what's wrong but I can get it to work by rearranging things a bit. Please note the comment in ?nls about "Do not use ‘nls’ on artificial "zero-residual" data."; I added a bit of noise.

## Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)

x2 <- c(3,4,5)
y2 <- c(20,30,40)

## make single x, y vector
x <- c(x1,x2)
set.seed(1001)
## (add a bit of noise to avoid zero-residual artificiality)
y <- c(y1,y2)+rnorm(5,sd=0.01)

g <- c(TRUE,TRUE,FALSE,FALSE,FALSE) ## specify identities of points

## particular changes:
##   * you have lower=upper=2 for x0.  Did you want 2<x0<3?
##   * specified data argument explicitly (allows use of predict() etc.)
##   * changed name from 'q' to 'fit1' (avoid R built-in function)
fit1 <- nls(y ~ ifelse(g,m1,m1+delta_m)*(x - x0) + y0,
         start = c(m1 = -1, delta_m = 2, y0 = 0, x0 = 2),
         algorithm = "port",
         lower = c(m1 = -Inf, delta_m = 0, y0 = -Inf, x0 = 2),
         upper = c(m1 = Inf, delta_m = Inf, y0 = Inf, x0 = 3),
         data=data.frame(x,y))

#Plot the original 'data'
plot(x,y,col=rep(c("red","blue"),c(2,3)),
           xlim=c(1,5),ylim=c(0,50))

## add predicted values
xvec <- seq(1,5,length.out=101)
lines(xvec,predict(fit1,newdata=data.frame(x=xvec)))

edit: based ifelse clause on point identity, not x position

edit: changed to require second slope to be > first slope

On a second look, I think the issue above is probably due to the use of separate vectors for x1 and x2 above, rather than a single x vector: I suspect these got replicated by R to match up with the g vector, which would have messed things up pretty badly. For example, this stripped-down example:

g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
ifelse(g,x1,x2)
## [1] 1 2 5 3 4

shows that x2 gets extended to (3 4 5 3 4) before being used in the ifelse clause. The scariest part is that normally one gets a warning such as this:

> x2 + 1:5
[1] 4 6 8 7 9
Warning message:
In x2 + 1:5 :
  longer object length is not a multiple of shorter object length

but in this case there is no warning ...

翻译

我有5(x,y)个数据点,并且我试图找到一个最合适的解决方案,该解决方案由两条在点(x0,y0)处相交的线组成,并遵循以下公式:

y1 = (m1)(x1 - x0) + y0
y2 = (m2)(x2 - x0) + y0


具体来说,我要求交集必须出现在x = 2和x = 3之间。看一下代码:

#Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)

x2 <- c(3,4,5)
y2 <- c(20,30,40)

g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

q <- nls(c(y1, y2) ~ ifelse(g == TRUE, m1 * (x1 - x0) + y0, m2 * (x2 - x0) + y0), start = c(m1 = -1, m2 = 1, y0 = 0, x0 = 2), algorithm = "port", lower = c(m1 = -Inf, m2 = -Inf, y0 = -Inf, x0 = 2), upper = c(m1 = Inf, m2 = Inf, y0 = Inf, x0 = 3))
coef <- coef(q)
m1 <- coef[1]
m2 <- coef[2]
y0 <- coef[3]
x0 <- coef[4]

#Plot the original x1, y1, and x2, y2
plot(x1,y1,xlim=c(1,5),ylim=c(0,50))
points(x2,y2)

#Plot the fits
x1 <- c(1,2,3,4,5)
fit1 <- m1 * (x1 - x0) + y0
lines(x1, fit1, col="red")

x2   <- c(1,2,3,4,5)
fit2 <- m2 * (x2 - x0) + y0
lines(x2, fit2, col="blue")


因此,您可以在此处看到列出的数据点。然后,通过nls运行它,获取参数m1m2x0y0(坡度和交点)。

但是,请看一下解决方案:


显然,红线(应该仅基于前两个点)不是最适合前两个点的线。蓝线(第二拟合)也是如此,这取决于最后三个点。怎么了
最佳答案
我不确定到底出了什么问题,但是我可以通过重新整理一点东西使其工作。请注意?nls中有关“请勿在人工“零残差”数据上使用'nls'的注释。”我加了一点噪音。

## Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)

x2 <- c(3,4,5)
y2 <- c(20,30,40)

## make single x, y vector
x <- c(x1,x2)
set.seed(1001)
## (add a bit of noise to avoid zero-residual artificiality)
y <- c(y1,y2)+rnorm(5,sd=0.01)

g <- c(TRUE,TRUE,FALSE,FALSE,FALSE) ## specify identities of points

## particular changes:
##   * you have lower=upper=2 for x0.  Did you want 2<x0<3?
##   * specified data argument explicitly (allows use of predict() etc.)
##   * changed name from 'q' to 'fit1' (avoid R built-in function)
fit1 <- nls(y ~ ifelse(g,m1,m1+delta_m)*(x - x0) + y0,
         start = c(m1 = -1, delta_m = 2, y0 = 0, x0 = 2),
         algorithm = "port",
         lower = c(m1 = -Inf, delta_m = 0, y0 = -Inf, x0 = 2),
         upper = c(m1 = Inf, delta_m = Inf, y0 = Inf, x0 = 3),
         data=data.frame(x,y))

#Plot the original 'data'
plot(x,y,col=rep(c("red","blue"),c(2,3)),
           xlim=c(1,5),ylim=c(0,50))

## add predicted values
xvec <- seq(1,5,length.out=101)
lines(xvec,predict(fit1,newdata=data.frame(x=xvec)))


编辑:基于点标识而不是x位置的ifelse子句

编辑:更改为要求第二斜率大于第一斜率

从第二个角度看,我认为上面的问题可能是由于对x1x2使用了单独的向量,而不是单个x向量:我怀疑R复制了这些向量以使其与g向量,这会使事情变得非常糟糕。例如,以下简化示例:

g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
ifelse(g,x1,x2)
## [1] 1 2 5 3 4


显示x2(3 4 5 3 4)子句中使用之前已扩展到ifelse。最可怕的部分是通常会有这样的警告:

> x2 + 1:5
[1] 4 6 8 7 9
Warning message:
In x2 + 1:5 :
  longer object length is not a multiple of shorter object length


但是在这种情况下,没有警告...
相关推荐

r - 绘制椭圆边界点的百分比

r - Segmentation Fault使用R中的xts对象列表

r - 使用R自动添加

r - R中的Anova:数据框选择

r - 如何检查R的下载映像,以确保在镜像过程中未对其进行篡改或破坏?

r - 使用data()作为通用S4函数

r - 如何从R中的开始日期和结束日期列表中计算组中的当前人数

r - 在R中处理多个文件

r - 使用qplot绘制线

r - 如何为动物园对象(晶格)的xyplot设置ylim