MLE with optim(), error in "gradient in optim evaluated to length 1 not 3"
I am fitting a Poisson regression with the optim function. I calculated the gradient of the log-likelihood and tried to pass it as an argument to optim, but I get the error "gradient in optim evaluated to length 1 not 3". Where is my mistake?
# Function to compute the log-likelihood
logLikePois <- function(y, x, parameter) {
  # coefficients to be estimated
  betaKoef <- parameter
  # lambda of the Poisson regression
  lambda <- exp(betaKoef %*% t(x))
  # Negative log-likelihood based on lambda. Negative because optim
  # only minimizes, while the likelihood is to be maximized
  logLikeliHood <- -sum(-lambda + y * log(lambda) - log(factorial(y)))
  return(logLikeliHood)
}

grad <- function(y, x, parameter) {
  betaKoef <- parameter
  # lambda of the Poisson regression
  lambda <- exp(betaKoef %*% t(x))
  gradient <- sum((y - exp(lambda)) %*% x)
  return(gradient)
}
# Evaluate the log-likelihood
data(discoveries)
disc <- data.frame(count = as.numeric(discoveries),
                   year = seq(0, (length(discoveries) - 1), 1))
yearSqr <- disc$year^2
formula <- count ~ year + yearSqr
form <- formula(formula)
# Build the model frame
model <- model.frame(formula, data = disc)
# Build the design matrix
x <- model.matrix(formula, data = disc)
# Extract the response variable
y <- model.response(model)
parFullModell <- c(1, 0, 0)
optimierung <- optim(par = parFullModell, fn = logLikePois,
                     x = x, y = y, gr = grad, method = "BFGS", hessian = TRUE)
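For what it's worth, the error message describes a shape problem: for a length-3 `par`, optim expects `gr` to return a length-3 vector, one partial derivative per coefficient, while the `sum()` in `grad` collapses the result to a single number. For the Poisson log-likelihood above, the gradient is

```latex
\nabla_\beta\, \ell(\beta) \;=\; \sum_{i=1}^{n} \bigl(y_i - \lambda_i\bigr)\, x_i \;=\; X^\top (y - \lambda),
\qquad \lambda_i = \exp(x_i^\top \beta),
```

a vector of length 3 here (intercept, year, yearSqr). Since fn is the *negative* log-likelihood, gr must return the negative of this vector.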
See also questions close to this topic

Finding a solution for a timetable maker
Hello, I'm currently working on a timetable maker and can't think of a satisfactory approach, as this is probably an NP-complete problem. Specifically, I want to create a tool to help plan the working groups of the several subjects at a university.
Let's get into that in a little more detail. Four entities are given:
1. Subjects: There is a list of subjects (maths, programming, algorithms, etc.). Each subject contains a list of students who participate in the lecture.
2. Timeslots: A list of timeslots is given, to which we can assign working groups (like "Monday, 1st lesson"). Multiple working groups (also from the same subject) can be assigned to the same timeslot.
3. Working groups: We assign students to working groups. A group belongs to a subject and has a minimum and maximum size.
4. Students: The students we assign to a group. Each student has preferences regarding timeslots and regarding the students they want to be in a group with. There are timeslots a student prefers, timeslots they can't be assigned to (e.g. because they are in another working group at that time, or for other reasons), and timeslots they don't care about.
The task would be to find an optimal solution.
But before I attack the optimal solution, I would at least like to find a feasible one. The hard constraints I would like to meet are:
1. that every student gets a place in a working group of each subject he is assigned to, without conflict, so that his working groups don't run in the same timeslot;
2. (similar to no. 1) that no student is assigned to a timeslot he has marked as "can't be assigned to";
3. that the group size boundaries are not violated.
Only two approaches come to mind: 1. allocating the groups randomly and then switching groups between timeslots and switching students between groups at random until a feasible solution appears (which could take forever), or 2. instead of switching objects randomly, using something like backtracking.
But backtracking can also take a very long time, considering all the possibilities the task allows. With, say, 60 students, I'm not sure that's an acceptable solution.
What do you think of this problem?
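Whichever of the two approaches is used, both need the three hard constraints pinned down as an explicit feasibility test in their inner loop. A minimal sketch in Python (the data structures are hypothetical illustrations, not part of the question):

```python
def is_feasible(assignment, forbidden, min_size, max_size):
    """Check the three hard constraints for a candidate assignment.

    assignment: dict student -> list of (group, timeslot) pairs
    forbidden:  dict student -> set of timeslots the student can't attend
    min_size, max_size: dict group -> bounds on the group's size
    """
    for student, placements in assignment.items():
        slots = [slot for _, slot in placements]
        # Constraint 1: a student's working groups must not share a timeslot.
        if len(slots) != len(set(slots)):
            return False
        # Constraint 2: respect the student's "can't be assigned to" slots.
        if any(slot in forbidden.get(student, set()) for slot in slots):
            return False
    # Constraint 3: every group's size must stay within its bounds.
    sizes = {}
    for placements in assignment.values():
        for group, _ in placements:
            sizes[group] = sizes.get(group, 0) + 1
    return all(min_size[g] <= n <= max_size[g] for g, n in sizes.items())
```

With a check like this, the backtracking variant becomes: assign students one at a time and prune a branch as soon as the partial assignment already violates constraint 1 or 2, which is usually far faster than repairing random assignments after the fact.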

Gulp PostCSS task slowing down machine
I 'wrote' this task runner a couple of days ago, but I'm having an issue with it. To avoid the XY problem, I'll start by explaining what my problem is. When the task runs, everything is usually fine for the first 10-15 minutes, but then my computer slows down so drastically that it barely works. I think it's because the task is watching all of the .scss files for changes: when I press save, it wastes more resources than it needs to, because I'm only modifying a single file while the task is watching and checking all of them for a change.
var gulp = require('gulp'),
    watch = require('gulp-watch'),
    postcssOrigin = require('postcss'),
    postcss = require('gulp-postcss'),
    postcssSCSS = require('postcss-scss'),
    browserSync = require('browser-sync').create(),
    postcssImport = require('postcss-import'),
    postcssSimpleVars = require('postcss-simple-vars'),
    postcssCalc = require('postcss-calc'),
    postcssColor = require('postcss-color-function'),
    postcssMixins = require('postcss-mixins'),
    postcssExtend = require('postcss-simple-extend'),
    postcssNesting = require('postcss-nesting'),
    autoprefixer = require('autoprefixer'),
    concatCSS = require('gulp-concat-css');

var handleError = function(error) {
    console.log(error.toString());
    this.emit('end');
}

// gulp.task('processhtml', function(){
//     return gulp.src('./app/*.html')
//         .pipe('')
//         .pipe(gulp.dest('./temp/'));
// });

gulp.task('browsersync', function(){
    browserSync.init({
        notify: false,
        server: {
            baseDir: 'app'
        }
    });
    watch('./app/*.html', function(){
        browserSync.reload();
    });
    watch('./app/css/scss/**/*.scss', function(){
        gulp.start('cssInject')
    });
});

// Process-css task -- !! This one is run as a dependency of the cssInject task.
gulp.task('processcss', function(){
    return gulp.src('./app/css/scss/main.scss')
        .pipe(postcss([
            postcssImport, postcssMixins, postcssCalc, postcssSimpleVars,
            postcssColor, postcssExtend, postcssNesting, autoprefixer
        ], {syntax: postcssSCSS}))
        .on('error', handleError)
        .pipe(concatCSS('style.css'))
        .on('error', handleError)
        .pipe(gulp.dest('./app/css/'));
});

// CSS Inject task
gulp.task('cssInject', ['processcss'], function(){
    return gulp.src('./app/css/style.css')
        .pipe(browserSync.stream());
});
Is there something that can be done to optimize it, or is something done in the wrong fashion that is causing this problem? The watch task for my .scss files looks kind of wrong to me. Is there a plugin that spares resources somehow? The machine has 4 GB of RAM and a dual-core Pentium (one of the newer ones); it's obviously not blazingly fast, but it should handle this with ease. When the machine starts to slow down, I notice that my HDD works at 100% while my CPU is nowhere near its limits, if that makes any difference. Thanks.
Edit: I also tried Blendid a few days ago and it did the same thing.

Self-organizing map optimization using triangle inequality
I am trying to optimize a SOM using the triangle inequality to improve the neighborhood search. My problem is that I can't figure out how to integrate the triangle-inequality code into the SOM code in Python, or whether there is a simple way to do it as a single function. I would really appreciate assistance.
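For concreteness, the standard place the triangle inequality speeds up a SOM is the best-matching-unit (BMU) search: for a reference unit r with known distance d(x, w_r), every other unit j satisfies d(x, w_j) >= |d(x, w_r) - d(w_r, w_j)|, so unit j can be skipped whenever that lower bound already exceeds the best distance found so far. A minimal NumPy sketch of this pruning (my own illustration, assuming the inter-unit distances are precomputed):

```python
import numpy as np

def bmu_with_pruning(x, weights, pairwise):
    """Find the best matching unit for input x.

    weights:  (n_units, dim) array of SOM codebook vectors
    pairwise: (n_units, n_units) precomputed inter-unit distances

    Units are skipped when the triangle-inequality lower bound
    |d(x, w_best) - d(w_best, w_j)| already exceeds the best distance.
    """
    best = 0
    best_d = np.linalg.norm(x - weights[0])
    for j in range(1, len(weights)):
        # Lower bound on d(x, w_j) without computing it.
        if abs(best_d - pairwise[best, j]) > best_d:
            continue  # unit j cannot beat the current best
        d = np.linalg.norm(x - weights[j])
        if d < best_d:
            best, best_d = j, d
    return best, best_d
```

The pairwise matrix has to be recomputed (or incrementally updated) as the weights change, so the payoff appears on large maps where each training step only touches a small neighbourhood.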

Unit-specific trends and R-squared near 1
I am currently working with a country panel dataset, running a diff-in-diff regression with unit-specific trends in Stata.
My main concern is that the adjusted R-squared obtained is really high, sometimes even 0.99. I assume this is a sign of some kind of mistake, but I do not know how to correct it.
The model has nearly 5,000 observations. There are 201 countries, 36 years, and 5 control variables, so the number of parameters is around 450.
Here I attach the code used:
xtset id_num year   // id_num = id_country
reg `outcome' i.treatment i.year i.id_num c.year#i.id_num `controls' if id_country!="USA" & `subgroup'==1, cluster(id_num)
In case it is useful, this is the first part of the output:
note: 201.id_num#c.year omitted because of collinearity

Linear regression                Number of obs =  4,789
                                 F(39, 174)    =      .
                                 Prob > F      =      .
                                 R-squared     = 0.9994
                                 Root MSE      = .20753

(Std. Err. adjusted for 175 clusters in id_country)

                            Robust
 obesity_as      Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
 1.treatment  .1847802   .1341994    1.38   0.170     -.080088    .4496483
 year
   1981       .2162895   .0156983   13.78   0.000      .185306    .2472731
   1982       .4461132   .0224864   19.84   0.000     .4017319    .4904944
   1983       .6690157   .0281392   23.78   0.000     .6134777    .7245538
   1984        .915047   .0311529   29.37   0.000     .8535609    .9765332
   1985       1.177176   .0344991   34.12   0.000     1.109085    1.245266
   1986       1.421679   .0389734   36.48   0.000     1.344758    1.498601
   1987        1.68354   .0413294   40.73   0.000     1.601969    1.765112
   1988       1.963494   .0440206   44.60   0.000     1.876611    2.050377
   1989       2.236331   .0472635   47.32   0.000     2.143048    2.329615
   1990        2.52923   .0498206   50.77   0.000       2.4309     2.62756

Bayesian and Frequentist linear regressions: different results
I've simulated three normal distributions with different means and standard deviations. Assume these are three groups of subjects. I wanted to see how a normal lm() would compare with a brm() regression. The model was Score ~ Group + (1|Subject). I haven't added a random intercept by group on purpose.
set.seed(2)
df = data.frame(Group = as.factor(rep(c("A", "B", "C"), each = 120)),
                Subject = rep(paste("subject", seq(1, 9), sep = "_"), each = 40),
                Score = c(rnorm(120, 5, 2), rnorm(120, 7, 4), rnorm(120, 9, 6)))
It turns out that the results are quite different. Groups B and A are significantly different in the lm(), but their 95% HDI often includes zero in the brm() output. (I understand that results will vary every time a model samples the posterior, but the bottom line is that the t value in the lm() model is 2.544, whereas the HDI in the brm() output would lead me to remain skeptical.) My question is: why is this happening? In other words, how do these two models compare regarding their treatment of heteroscedastic data?

How to estimate accuracy of trained model on future data?
I use Python machine learning (regression) to model and predict the output of a simulation software package (the input of the simulation is the detail of a house and the output is its energy consumption). My intention is to bypass the simulation in my optimization algorithm to reduce time. Using different ML models (a neural network and a support vector machine) on 10,000 simulated records, I got very good accuracy, but in some cases the error is high.
I used grid search with 10-fold CV to optimize the hyperparameters of the ML models, then saved the model using pickle. Now I want to use this model as the main engine for optimizing the performance of the studied buildings: I send input parameters to the ML model and get back the predicted output (energy usage) values.
I want to check whether there is any way to assess the confidence of the trained model based only on the input values. I need this so that, if the model is not very confident on a specific record, I can call the simulation to calculate the output value. In other words, can we approximately identify the records that are likely to have a high error using only the prediction model, without knowing the actual values?
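One common way to get a per-record confidence signal from the inputs alone is an ensemble: train several models (or use a model that already is an ensemble) and treat the spread of the members' predictions as a proxy for uncertainty, falling back to the simulator whenever the spread is large. A minimal sketch of the idea with scikit-learn's RandomForestRegressor on toy data (the data and the fallback logic are illustrative, not the questioner's actual setup):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy stand-in for the simulated records: 2 input features, 1 output.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def predict_with_spread(model, X_new):
    """Return the forest's mean prediction and the std-dev across trees.

    A large spread means the trees disagree, which can serve as a cheap
    trigger for re-running the expensive simulation on that record."""
    per_tree = np.stack([tree.predict(X_new) for tree in model.estimators_])
    return per_tree.mean(axis=0), per_tree.std(axis=0)

X_in = np.array([[0.5, 0.0]])     # inside the training range
X_out = np.array([[10.0, 10.0]])  # far outside it
pred_in, spread_in = predict_with_spread(model, X_in)
pred_out, spread_out = predict_with_spread(model, X_out)
```

For an SVM or a neural network, the same trick works by training several copies on bootstrap resamples of the 10,000 records; alternatives that expose predictive variance directly include quantile regression forests and Gaussian-process regression.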

Changing EditText's line colors to a gradient
I'm trying to change the EditText's line colors to gradient colors (both the non-active color and the active color). I've created drawable/line_gradient.xml, but I can't seem to put it anywhere without crashing. I've tried putting it in styles.xml as <item name="colorAccent">@drawable/line_gradient</item>. I've tried adding android:backgroundTint="@drawable/line_gradient" and android:drawableTint="@drawable/line_gradient". Is there any way to add a gradient to the EditText's active and non-active colors, i.e. the line under the EditText?
drawable/line_gradient.xml
<?xml version="1.0" encoding="utf-8"?>
<shape xmlns:android="http://schemas.android.com/apk/res/android">
    <gradient
        android:angle="90"
        android:startColor="#F19E2D"
        android:endColor="#F71B1B"
        android:type="linear" />
    <corners android:radius="0dp"></corners>
</shape>
activity_main.xml
<android.support.design.widget.TextInputLayout
    android:layout_width="match_parent"
    android:layout_height="wrap_content"
    android:layout_marginLeft="30dp">
    <EditText
        android:id="@+id/loginEt"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:hint="Username" />
</android.support.design.widget.TextInputLayout>
</LinearLayout>

Gnuplot: How to insert gradient colour in a 2d plot?
I am trying to make a hyperspherical contour plot of my data and don't know how to make the lines in gradient colour. I am not an expert; I got this model from somewhere. What I am doing is:
set terminal postscript landscape enhanced color solid defaultplex "Helvetica" 14
set output "toc.eps"
set surface
set contour base
set noclabel
set size 1
set xtics 1
set ytics 1
set noztics
set rrange [ * : * ] noreverse nowriteback   # (currently [0:10] )
set trange [ * : * ] noreverse nowriteback   # (currently [-5:5] )
set urange [ * : * ] noreverse nowriteback   # (currently [-5:5] )
set vrange [ * : * ] noreverse nowriteback   # (currently [-5:5] )
set xlabel ""
set xrange [ -1 : 1 ] noreverse nowriteback  # (currently [-10:10] )
set ylabel ""
set yrange [ -1 : 1 ] noreverse nowriteback  # (currently [-10:10] )
set zlabel ""
set zrange [ * : * ] noreverse nowriteback   # (currently [-10:10])
unset surface
set angle rad
set size square
set xlabel "{/Helvetica-Bold=18 {/Symbol b}*}" offset 0.00000,0.4000000
set ylabel "{/Helvetica-Bold=18 {/Symbol g}*} " offset 0.400000,0.000000
Eh2kcmol = 627.509438586222
rot = 360/3*pi/180
rot = 0
set cntrparam levels 100
set cntrparam levels incremental 0.502860934,0.00502860934
set table "CNOA.res"
splot "CNOA.dat" u ($2*cos($1-rot)):($2*sin($1-rot)):3
unset table
set format x "%.0f"
set format y "%.0f"
set multiplot
# HERE IS WHERE I WANT THE GRADIENT
plot [-1.2:1.5][-1.2:1.5] "CNOA.res" u 1:2 w l lt 1 lw 1.0 ti ""
set parametric
set xrange [-1.2:1.5]
set yrange [-1.2:1.5]
plot sin(t),cos(t) w l lt 1 lw 2 lc 0 ti ""
unset multiplot
Any help is appreciated.

Gradient fill columns using ggplot2 doesn't seem to work
I would like to create a gradient within each bar of this graph, going from red (low) to green (high).
At present I am using specific colours within geom_col, but I want each individual bar to scale from red to green depending on the value it depicts.
Here is a simplified version of my graph (there is also a geom_line on the graph, among several other things such as titles and axis adjustments, but it isn't relevant to this question so I have excluded it from the example).
I have removed the hardcoded colours from the columns and tried things such as scale_fill_gradient (and numerous other similar functions) to apply a gradient to said columns, but nothing seems to work.
Here is what the output is when I use scale_fill_gradient(low = "red", high = "green"): what I want is for each bar to have its own gradient, not for each bar to represent a step in one overall gradient.
How can I achieve this using ggplot2? My code for the above (green) example:
ggplot(data = all_chats_hn, aes(x = as.Date(date))) +
  geom_col(aes(y = total_chats), colour = "black", fill = "forestgreen")

How do you make a histogram in R?
I have this Poisson distribution: 0=59, 1=22, 2=9, 3=2, 4=0, 5=1. I have to make a histogram of the theoretical probabilities and of their relative frequencies, but I don't know how.

Hessian matrix with optim or the numDeriv package?
I am doing maximum likelihood estimation for the log-likelihood function of a Poisson distribution. After the estimation I want to compute the standard errors of the coefficients, and for that I need the Hessian matrix. Now I don't know which function I should use to get the Hessian matrix: optim(), or hessian() from the numDeriv package.
The two functions give me different solutions, and if I try to compute the standard errors from the Hessian that I get from optim, I get one NaN entry in the result. What is the difference between these two functions' computation of the Hessian matrix?
logLikePois <- function(parameter, y, z) {
  betaKoef <- parameter
  lambda <- exp(betaKoef %*% t(z))
  logLikeliHood <- -(sum(-lambda + y * log(lambda) - log(factorial(y))))
  return(logLikeliHood)
}

grad <- function(y, z, parameter) {
  betaKoef <- parameter
  # lambda of the Poisson regression
  lambda <- exp(betaKoef %*% t(z))
  gradient <- -((y - lambda) %*% (z))
  return(gradient)
}

data(discoveries)
disc <- data.frame(count = as.numeric(discoveries),
                   year = seq(0, (length(discoveries) - 1), 1))
yearSqr <- disc$year^2
formula <- count ~ year + yearSqr
form <- formula(formula)
model <- model.frame(formula, data = disc)
z <- model.matrix(formula, data = disc)
y <- model.response(model)
parFullModell <- rep(0, ncol(z))

optimierung <- optim(par = parFullModell, gr = grad, fn = logLikePois,
                     z = z, y = y, method = "BFGS", hessian = TRUE)

optimHessian <- optimierung$hessian
numderivHessian <- hessian(func = logLikePois, x = optimierung$par, y = y, z = z)

sqrt(diag(solve(optimHessian)))
sqrt(diag(solve(numderivHessian)))
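As a third value to compare the two numerical Hessians against: for this model the Hessian of the negative Poisson log-likelihood has a closed form,

```latex
H(\beta) \;=\; -\,\frac{\partial^2 \ell}{\partial \beta\, \partial \beta^\top}
\;=\; \sum_{i=1}^{n} \lambda_i\, z_i z_i^\top
\;=\; Z^\top \operatorname{diag}(\lambda)\, Z,
\qquad \lambda_i = \exp(z_i^\top \beta),
```

where ℓ is the (positive) log-likelihood, so H is the Hessian of the function being minimized. In R this can be evaluated at the optimum as t(z) %*% diag(as.vector(lambda)) %*% z, and whichever numerical Hessian agrees with it is the one to trust for standard errors.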

Computing the marginal likelihood of a Gaussian model in R with integrate()
I am trying to compute the marginal likelihood of a Gaussian model in R. More precisely, I am trying to integrate the likelihood over both a Gaussian prior on mu and a Gaussian prior on sigma, given some observations yi.
In other words, I am trying to compute:
p(y) = ∫∫ [ ∏_i N(y_i | mu, sigma) ] N(mu | 110, 1) N(sigma | 10, 1) dmu dsigma
I tried to write this in R using the following function (following a similar SO question: Quadrature to approximate a transformed beta distribution in R):
marglik <- function(data) {
  integrand <- Vectorize(function(data, mu, sigma) {
    prod(dnorm(data, mu, sigma)) * dnorm(mu, 110, 1) * dnorm(sigma, 10, 1)
  })
  integrate(integrand, lower = 0, upper = Inf, mu = 100, sigma = 10)$value
}
Using this function, I can compute the marginal likelihood of the above model for a set of observations:
set.seed(666)
d <- rnorm(100, mean = 107.5, sd = 2.5)
marglik(data = d)
[1] 9.704133e-24
However, the results I obtain with this procedure are quite different from the results I obtain with grid approximation or with other packages/software.
My question is then: is it possible to do this double integration with integrate? If so, how would you do it?
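One source of the discrepancy is that the quantity in the question is a two-dimensional integral (over both mu and sigma), while R's integrate() is strictly one-dimensional: as called above, it integrates the likelihood over its first argument with mu and sigma held fixed. The intended computation nests one integral inside the other. As a language-neutral illustration of that nesting (sketched here in Python, where scipy.integrate.dblquad plays the role of two nested one-dimensional quadratures; the finite bounds and the 10-point sample are my own choices to keep the quadrature well-behaved):

```python
import numpy as np
from scipy import stats
from scipy.integrate import dblquad

# Toy data standing in for the observations y_i (a small n keeps the
# likelihood product well-scaled for the quadrature).
rng = np.random.default_rng(666)
data = rng.normal(107.5, 2.5, size=10)

def integrand(sigma, mu):
    # Likelihood of all observations times the two priors at (mu, sigma).
    lik = np.prod(stats.norm.pdf(data, mu, sigma))
    return lik * stats.norm.pdf(mu, 110, 1) * stats.norm.pdf(sigma, 10, 1)

# Outer integral over mu, inner integral over sigma; the finite bounds
# are chosen wide enough to cover essentially all of the posterior mass.
marglik, err = dblquad(integrand, 100, 115, 0.1, 20)
```

In R the same nesting can be written with two integrate() calls (integrating over sigma inside a Vectorize'd function of mu, then integrating that function over mu), or with a multidimensional quadrature package such as cubature.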

Is there a restriction test I can use for GARCH models in ugarch from 'rugarch' package?
I need to test the restriction 'alpha1 + beta1 = 1' on my ugarch model using a statistic such as an F, t, or even chi-square test.
How can I do this?

Max Likelihood for coin toss
Consider the following coin tossing problem. For the first coin toss of each round, the coin lands heads with probability 𝑝. For each toss thereafter, the result matches the outcome of the first toss with probability 𝑞. Everything resets after the conclusion of each round of coin tosses. For instance, suppose we observed the following results (H is heads, T is tails): • Round 1: HHT • Round 2: TTT The probability of observing the sequence of tosses in the first round is 𝑝⋅𝑞⋅(1−𝑞) and the probability of observing the sequence of tosses in the second round is (1−𝑝)⋅𝑞⋅𝑞. Suppose you have data on 𝑁 rounds of coin tosses with 𝑛 tosses in each round.
The problem asks us to write a likelihood function in p and q. I would like to find the likelihood of all heads, which is L = p * q^(n-1). To find the maximum likelihood, I take the log-likelihood, compute the partial derivatives with respect to p and q, and set them to 0. However, I end up with 1/p = 0 and (n-1)/q = 0, which has no solution. Does anyone know what's going on?
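For the all-heads case, the failure of the first-order conditions is expected rather than a mistake: the log-likelihood

```latex
\log L(p, q) = \log p + (n-1)\log q,
\qquad
\frac{\partial \log L}{\partial p} = \frac{1}{p} > 0,
\quad
\frac{\partial \log L}{\partial q} = \frac{n-1}{q} > 0,
```

is strictly increasing in both parameters on (0, 1], so it has no interior stationary point and the maximum sits on the boundary at p-hat = q-hat = 1. Setting derivatives to zero only finds interior maxima; here the MLE is a boundary solution.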