Modeling response times

Not an expert, but maybe the ex-Gaussian (the convolution of a Gaussian and an exponential distribution)?

From the last link:

In the framework of cognitive processes, this convolution can be seen as representing the overall distribution of RT [Response Time] resulting from two additive or sequential processes.
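Because the ex-Gaussian is just the sum of independent Gaussian and exponential components, it is easy to simulate and to check its moments numerically. A minimal sketch (the parameter values below are invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: Gaussian component (mu, sigma) and
# exponential component with rate lam, in milliseconds.
mu, sigma, lam = 300.0, 30.0, 1.0 / 100.0

# An ex-Gaussian draw is a Gaussian draw plus an exponential draw
n = 100_000
rt = rng.normal(mu, sigma, n) + rng.exponential(1.0 / lam, n)

# Theoretical moments: mean = mu + 1/lam, var = sigma^2 + 1/lam^2
print(rt.mean())   # ~ 400
print(rt.var())    # ~ 10900
```

`scipy.stats.exponnorm` provides the same distribution with a pdf and fitting routines, if simulation is not enough.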

Intraclass correlation and aggregation

Thus, my questions are:

What descriptive labels would you attach to different values of the intraclass correlation? The aim is to relate the values of the intraclass correlation to qualitative language such as: "When the intraclass correlation is greater than x, it suggests that the attitudes are modestly/moderately/strongly shared across team members."
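For context, a sketch of how the one-way random-effects ICC(1) falls out of the ANOVA decomposition; the team-effect and noise variances below are invented so that the true ICC is 0.5:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data: 500 teams of k = 5 members each; team effect and
# individual noise both have variance 1, so the true ICC(1) is 0.5.
J, k = 500, 5
team_effect = rng.normal(size=(J, 1))
y = team_effect + rng.normal(size=(J, k))

grand = y.mean()
group_means = y.mean(axis=1)

# Between- and within-team mean squares from one-way ANOVA
msb = k * np.sum((group_means - grand) ** 2) / (J - 1)
msw = np.sum((y - group_means[:, None]) ** 2) / (J * (k - 1))

# One-way random-effects ICC(1)
icc = (msb - msw) / (msb + (k - 1) * msw)
print(icc)   # should land near 0.5
```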

Appropriate normality tests for small samples

So far, I've been using the Shapiro-Wilk statistic in order to test normality assumptions in small samples. The fBasics package in R (part of Rmetrics) includes several normality tests, covering many of the popular frequentist tests -- Kolmogorov-Smirnov, Shapiro-Wilk, Jarque-Bera, and D'Agostino -- along with a wrapper for the normality tests in the nortest package -- Anderson-Darling, Cramér-von Mises, Lilliefors (Kolmogorov-Smirnov), Pearson chi-square, and Shapiro-Francia.
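In Python, `scipy.stats` exposes several of the same tests (`shapiro`, `normaltest` for D'Agostino's K², `jarque_bera`, `anderson`). A small sketch with synthetic data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(size=25)              # small sample from a true normal

w, p_sw = stats.shapiro(x)           # Shapiro-Wilk
k2, p_da = stats.normaltest(x)       # D'Agostino's K^2

# A clearly non-normal sample should be rejected decisively
z = rng.exponential(size=200)
_, p_exp = stats.shapiro(z)

print(p_sw, p_da, p_exp)
```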

Comparing 2 independent non-central t statistics

The sample Sharpe ratio is the sample mean divided by the sample standard deviation. Up to a constant factor ($\sqrt{n}$, where $n$ is the number of observations), this is distributed as a (possibly non-central) $t$-statistic.
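The constant-factor relationship is easy to verify numerically: with the sample standard deviation computed with `ddof=1`, $\sqrt{n}$ times the sample Sharpe ratio is exactly the one-sample $t$ statistic for testing a zero mean (the returns below are synthetic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
r = rng.normal(0.05, 1.0, size=250)      # hypothetical daily returns

n = len(r)
sharpe = r.mean() / r.std(ddof=1)        # sample Sharpe ratio (per period)

# One-sample t statistic for H0: mean return = 0
t_stat, _ = stats.ttest_1samp(r, 0.0)

print(np.sqrt(n) * sharpe, t_stat)       # identical up to float error
```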

Constructing smoothing splines with cross-validation

Can someone provide me with a book or online reference on how to construct smoothing splines with cross-validation? I would also appreciate an overview of whether this smoothing technique is a good one for smoothing data and whether there are any disadvantages of which a non-statistician needs to be aware.
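A hand-rolled sketch of the idea, assuming only `scipy.interpolate.UnivariateSpline`: choose the smoothing parameter `s` by leave-one-out cross-validation over a small grid (the grid values here are arbitrary):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 80)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

def loo_cv_score(s):
    """Leave-one-out CV error of a smoothing spline with penalty s."""
    errs = []
    for i in range(x.size):
        mask = np.ones(x.size, bool)
        mask[i] = False                       # hold out one point
        spl = UnivariateSpline(x[mask], y[mask], s=s)
        errs.append((spl(x[i]) - y[i]) ** 2)  # prediction error on it
    return np.mean(errs)

grid = [1.0, 5.0, 10.0, 50.0, 200.0]
scores = {s: loo_cv_score(s) for s in grid}
best_s = min(scores, key=scores.get)
print(best_s, scores[best_s])
```

In practice generalized cross-validation (GCV) is usually used instead of explicit leave-one-out, since it avoids refitting n times; R's `smooth.spline` does this by default.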

Using information geometry to define distances and volumes…useful?

I came across a large body of literature which advocates using Fisher's Information metric as a natural local metric in the space of probability distributions and then integrating over it to define distances and volumes. But are these "integrated" quantities actually useful for anything?
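As a toy check that the local metric behaves as advertised, one can recover the closed form $I(p) = 1/(p(1-p))$ for the Bernoulli family numerically, as the expected negative second derivative of the log-likelihood:

```python
import numpy as np

def fisher_info_bernoulli(p, eps=1e-5):
    """Fisher information of Bernoulli(p), computed numerically as the
    expectation of -d^2/dp^2 log-likelihood via finite differences."""
    def loglik(theta, x):
        return x * np.log(theta) + (1 - x) * np.log(1 - theta)
    info = 0.0
    for x, prob in ((1, p), (0, 1 - p)):   # expectation over outcomes
        d2 = (loglik(p + eps, x) - 2 * loglik(p, x)
              + loglik(p - eps, x)) / eps**2
        info += -prob * d2
    return info

p = 0.3
print(fisher_info_bernoulli(p), 1.0 / (p * (1 - p)))  # both ~ 4.76
```

Integrating the square root of this metric along a path between two parameter values is what gives the Fisher-Rao geodesic distance referred to in that literature.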

Is there a way to remember the definitions of Type I and Type II Errors?


Since Type II means "false negative", or sort of "false false", I remember it as the number of falses. If you believe such an argument:

Type I errors are of primary concern
Type II errors are of secondary concern

Note: I'm not endorsing this value judgement, but it does help me remember Type I from Type II.

Survival Analysis tools in Python

I am wondering if there are any packages for Python that are capable of performing survival analysis. I have been using the survival package in R but would like to port my work to Python.
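Packages such as lifelines and scikit-survival now cover much of what R's survival package does. To show there is no magic involved, here is a minimal pure-NumPy Kaplan-Meier sketch (a teaching aid, not a substitute for a proper library):

```python
import numpy as np

def kaplan_meier(durations, observed):
    """Minimal Kaplan-Meier estimator: returns the distinct event times
    and the survival probability just after each one."""
    durations = np.asarray(durations, float)
    observed = np.asarray(observed, bool)
    times = np.unique(durations[observed])   # censored times add no steps
    surv, s = [], 1.0
    for t in times:
        at_risk = np.sum(durations >= t)     # still under observation at t
        deaths = np.sum((durations == t) & observed)
        s *= 1.0 - deaths / at_risk          # product-limit update
        surv.append(s)
    return times, np.array(surv)

# Toy data: subject 3 is right-censored at time 3
t, s = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 1])
print(t)   # [1. 2. 4. 5.]
print(s)   # [0.8 0.6 0.3 0. ]
```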

What are alternatives to broken axes?


(3) You can show the broken plot side-by-side with the same plot on unbroken axes.
(4) In the case of your bar chart example, choose a suitable (perhaps hugely stretched) vertical axis and provide a panning utility.
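One common way to approximate a broken axis in matplotlib is two stacked subplots that share x but cover different y-ranges, with the adjoining spines hidden. A rough sketch (data and limits are invented):

```python
import matplotlib
matplotlib.use("Agg")          # non-interactive backend for this sketch
import matplotlib.pyplot as plt
import numpy as np

# Toy data with one huge bar, the usual motivation for a broken axis
x = np.arange(5)
y = np.array([3, 4, 2, 5, 120])

# Two stacked axes with different y-ranges simulate the "break"
fig, (top, bottom) = plt.subplots(2, 1, sharex=True)
top.bar(x, y)
bottom.bar(x, y)
top.set_ylim(110, 130)         # only the tall bar
bottom.set_ylim(0, 10)         # everything else

# Hide the adjoining spines to suggest a break
top.spines["bottom"].set_visible(False)
bottom.spines["top"].set_visible(False)
top.tick_params(bottom=False)

fig.savefig("broken_axis.png")
```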

Shall I trust AIC (non-full model) or slope (full model)?

The purpose of running regressions of butterfly richness against 5 environmental variables is to show the importance ranking of the independent variables, mainly by AIC. In non-full models, they reveal that variable A tends to be more influential than the others by delta AIC.
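For intuition, delta AIC can be computed by hand for Gaussian OLS fits. In the synthetic data below only variable A actually drives richness, so the model containing A wins by a large margin (variable names and data are invented):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
# Hypothetical predictors; only A truly drives richness here
A, B = rng.normal(size=(2, n))
richness = 2.0 * A + rng.normal(size=n)

def ols_aic(y, X):
    """Gaussian AIC for an OLS fit with intercept: n*log(RSS/n) + 2k."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    k = X.shape[1] + 1            # coefficients plus the error variance
    return len(y) * np.log(rss / len(y)) + 2 * k

aic_A = ols_aic(richness, A)
aic_B = ols_aic(richness, B)
print(aic_B - aic_A)   # large positive delta AIC: A is far more supported
```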

Variation in PCA weights

I have weights of SNP variation (output through the Eigenstrat program) for each SNP for the three main PCs, e.g.:

SNPNam PC1-wt PC2-wt PC3-wt
SNP_1 -1.