3グループのRNA-seqカウントデータのシミュレーション

3グループデータのモデル化

前回に続きましてRNA-seqの記事になります。まずは、3グループの場合のそれぞれのサンプル、遺伝子のカウントデータの平均パラメータを数式で簡単にモデル化していきたいと思います。こちらのページ(https://bi.biopapyrus.jp/rnaseq/analysis/de-analysis/3g-edger.html)にあるように、一般化線形モデルにおけるのリンク関数と線形予測子の考え方を用いて、ある遺伝子のサンプルごとの平均パラメータをベクトル \lambda で表すと、3つのグループの比較で各グループに3つの反復があるとき、以下のように考えることができます。

 
\log(\boldsymbol{\lambda}) = 
\begin{pmatrix}
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 0 & 0 \\
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & 0 & 1 \\
1 & 0 & 1 \\
\end{pmatrix}
\begin{pmatrix}
\alpha \\
\beta_{1} \\
\beta_{2} \\
\end{pmatrix}

実際に計算して見ると\lambdaについて、以下のようになります。


\boldsymbol{\lambda} = 
\begin{pmatrix}
\exp(\alpha) \\
\exp(\alpha) \\
\exp(\alpha) \\
\exp(\alpha)\exp(\beta_1) \\
\exp(\alpha)\exp(\beta_1) \\
\exp(\alpha)\exp(\beta_1) \\
\exp(\alpha)\exp(\beta_2) \\
\exp(\alpha)\exp(\beta_2) \\
\exp(\alpha)\exp(\beta_2) \\
\end{pmatrix}

上から3つをGroup1、真ん中3つをGroup2、残り3つをGroup3のサンプルの平均とします。同じグループの反復は同じ平均を示していていい感じです。また、Group2はGroup1に対して\exp(\beta_1)倍だけ平均が異なり、Group3はGroup1に対して\exp(\beta_2)倍だけ異なる事がわかります。

2つのグループではGroup1とGroup2の発現の比(Fold change)を考えるだけでしたが、3つのグループを考えるときは2つのFold changeが考えられるということになります。

\alphaに関してはその遺伝子の全体的な発現レベルを決める役割を果たします。この値が小さければ全体的に発現が小さいということを表します。

では実際にこの式をもとに、前回の記事で考えた各発現パターンにおける平均パラメータ\lambdaを計算してみましょう。ここでは発現パターンについてのみ考えるので、\alpha = \log(100)で固定しています。

非発現変動遺伝子(nonDEG)

nonDEGの場合は簡単で、\beta_1 = 0, \beta_2 = 0とすれば良いです。

alpha <- log(100)
beta_1 <- log(1)
beta_2 <- log(1)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121183611p:plain
nonDEG

発現変動遺伝子(1対他)

続いてあるグループでその他2つのグループと比べて発現変動している場合について考えます。Fold changeについては2とします。

G1 > G2 = G3

Group1においてのみ高発現の場合。

beta_1 <- log(1/2)
beta_2 <- log(1/2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121183653p:plain
G1 > G2 = G3

G1 < G2 = G3

Group1においてのみ低発現の場合。

beta_1 <- log(2)
beta_2 <- log(2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121221420p:plain
G1 < G2 = G3

G2 > G1 = G3

Group2においてのみ高発現の場合。

beta_1 <- log(2)
beta_2 <- log(1)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121221520p:plain
G2 > G1 = G3

G2 < G1 = G3

Group2においてのみ低発現の場合。

beta_1 <- log(1/2)
beta_2 <- log(1)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121221551p:plain
G2 < G1 = G3

G3 > G1 = G2

Group3においてのみ高発現の場合。

beta_1 <- log(1)
beta_2 <- log(2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121221621p:plain
G3 > G1 = G2

G3 < G1 = G2

Group3においてのみ低発現の場合。

beta_1 <- log(1)
beta_2 <- log(1/2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121221730p:plain
G3 < G1 = G2

全てのグループ間で発現変動

さらに続けて、全てのグループの間で発現変動している遺伝子について考えます。この場合もFolc changeについては
2を考えますが、一番発現が小さいグループと一番発現が多いグループのFolc changeは2^2 = 4となります。

G1 < G2 < G3
beta_1 <- log(2)
beta_2 <- log(4)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121221923p:plain
G1 < G2 < G3

G1 < G3 < G2
beta_1 <- log(4)
beta_2 <- log(2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121221950p:plain
G1 < G3 < G2

G2 < G1 < G3
beta_1 <- log(1/2)
beta_2 <- log(2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222034p:plain
G2 < G1 < G3

G3 < G1 < G2
beta_1 <- log(2)
beta_2 <- log(1/2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222103p:plain
G3 < G1 < G2

G2 < G3 < G1
beta_1 <- log(1/4)
beta_2 <- log(1/2)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222127p:plain
G2 < G3 < G1

G3 < G2 < G1
beta_1 <- log(1/2)
beta_2 <- log(1/4)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222152p:plain
G3 < G2 < G1

Fold changeをランダムに設定する

これまではFold change(\beta)を固定して考えてきましたが、現実のRNA-seqデータをシミュレーションすることが目標なので、このパラメータをランダムに設定していきたいと思います。気をつけるべきところは、DEGの場合にランダムに設定してたまたまnonDEGとほぼ同じような発現データの状況になったということを避けることでしょう。そのために、発現変動があるというグループ間には
最低でも1.5倍の差が生じるようにしました。

G1 > G2 = G3
alpha <- log(100)
beta_1 <- beta_2 <- runif(1, min = -log(4), max = -log(1.5))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222228p:plain
G1 > G2 = G3

G1 < G2 = G3
alpha <- log(100)
beta_1 <- beta_2 <- runif(1, min = log(1.5), max = log(4))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222314p:plain
G1 < G2 = G3

G2 > G1 = G3
alpha <- log(100)
beta_1 <- runif(1, min = log(1.5), max = log(4))
beta_2 <- log(1)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222345p:plain
G2 > G1 = G3

G2 < G1 = G3
alpha <- log(100)
beta_1 <- runif(1, min = -log(4), max = -log(1.5))
beta_2 <- log(1)
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222409p:plain
G2 < G1 = G3

G3 > G1 = G2
alpha <- log(100)
beta_1 <- log(1)
beta_2 <- runif(1, min = log(1.5), max = log(4))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222436p:plain
G3 > G1 = G2

G3 < G1 = G2
alpha <- log(100)
beta_1 <- log(1)
beta_2 <- runif(1, min = -log(4), max = -log(1.5))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222458p:plain
G3 < G1 = G2

G1 < G2 < G3
beta_1 <- runif(1, min = log(1.5), max = log(4))
beta_2 <- runif(1, min = beta_1 + log(1.5), max = beta_1 + log(4))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222523p:plain
G1 < G2 < G3

G1 < G3 < G2
beta_2 <- runif(1, min = log(1.5), max = log(4))
beta_1 <- runif(1, min = beta_2 + log(1.5), max = beta_2 + log(4))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222548p:plain
G1 < G3 < G2

G2 < G1 < G3
beta_1 <- runif(1, min = -log(4), max = -log(1.5))
beta_2 <- runif(1, min = log(1.5), max = log(4))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222608p:plain
G2 < G1 < G3

G3 < G1 < G2
beta_1 <- runif(1, min = log(1.5), max = log(4))
beta_2 <- runif(1, min = -log(4), max = -log(1.5))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222640p:plain
G3 < G1 < G2

G2 < G3 < G1
beta_2 <- runif(1, min = -log(4), max = -log(1.5))
beta_1 <- beta_2 + runif(1, min = -log(4), max = -log(1.5))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222704p:plain
G2 < G3 < G1

G3 < G2 < G1
beta_1 <- runif(1, min = -log(4), max = -log(1.5))
beta_2 <- beta_1 + runif(1, min = -log(4), max = -log(1.5))
X <- model.matrix(~gl(3, 3))
lambda <- exp(X %*% c(alpha, beta_1, beta_2))
plot(lambda, type = "h", lwd = 30, lend = 1, col = "#696969", xlim = c(0, 10), ylim = c(-5, max(lambda)), axes = F, ann = F)
text(c(2, 5, 8), c(-3, -3, -3), c("Group1", "Group2", "Group3"), adj = c(0.5, 1))
segments(c(0.6, 3.6, 6.6), c(-1, -1, -1), c(3.4, 6.4, 9.4), c(-1, -1, -1))

f:id:menseki:20191121222735p:plain
G3 < G2 < G1

疑似RNA-seqカウントデータ(3グループ)の生成

最後に3グループRNA-seqカウントデータのシミュレーションデータを生成してみたいと思います。ここまで、各発現パターンについて細かく考えてきましたが、シミュレーションデータを生成するだけなら、適当に\alpha, \beta_1, \beta_2を定めてやって各グループのデータの平均パラメータ\lambdaを求めてからそのベクトルの要素を適切に並び替えれば良いと思います。

あとはその平均パラメータをもとに負の二項分布に基づくノイズを加えてあげれば擬似的カウントデータの完成です。

以下には汎用性にかけますが、RNA-seqカウントデータ(3グループ)の生成コードを示します。

generateCountMeanMatrix <- function(Ngene, z) {
  X <- model.matrix(~gl(3,1))
  beta_1 <- runif(Ngene, min = log(1.5), max = log(4))
  beta_2 <- beta_1 + runif(Ngene, min = log(1.5), log(4))
  alpha <- rnorm(Ngene, 4, 1)
  tmp <- t(X %*% rbind(alpha, beta_1, beta_2))
  
  logMeanMat <- matrix(NA, nrow = Ngene, ncol = 9)
  # nonDEG
  logMeanMat[z ==  0,       ] <- tmp[z == 0, 1]
  # DEG type 1
  logMeanMat[z ==  1, 1:3   ] <- tmp[z == 1, 1]
  logMeanMat[z ==  1, -(1:3)] <- tmp[z == 1, 2]
  
  # DEG type 2
  logMeanMat[z ==  2,    1:3] <- tmp[z == 2, 2]
  logMeanMat[z ==  2, -(1:3)] <- tmp[z == 2, 1]
  
  # DEG type 3
  logMeanMat[z ==  3,    4:6] <- tmp[z == 3, 1]
  logMeanMat[z ==  3, -(4:6)] <- tmp[z == 3, 2]
  
  # DEG type 4
  logMeanMat[z ==  4,    4:6] <- tmp[z == 4, 2]
  logMeanMat[z ==  4, -(4:6)] <- tmp[z == 4, 1]
  
  # DEG type 5
  logMeanMat[z ==  5,    7:9] <- tmp[z == 5, 1]
  logMeanMat[z ==  5, -(7:9)] <- tmp[z == 5, 2]
  
  # DEG type 6
  logMeanMat[z ==  6,    7:9] <- tmp[z == 6, 2]
  logMeanMat[z ==  6, -(7:9)] <- tmp[z == 6, 1]
  
  # DEG type 7
  logMeanMat[z ==  7,    1:3] <- tmp[z == 7, 1]
  logMeanMat[z ==  7,    4:6] <- tmp[z == 7, 2]
  logMeanMat[z ==  7,    7:9] <- tmp[z == 7, 3]
  
  # DEG type 8
  logMeanMat[z ==  8,    1:3] <- tmp[z == 8, 1]
  logMeanMat[z ==  8,    4:6] <- tmp[z == 8, 3]
  logMeanMat[z ==  8,    7:9] <- tmp[z == 8, 2]
  
  # DEG type 9
  logMeanMat[z ==  9,    1:3] <- tmp[z == 9, 2]
  logMeanMat[z ==  9,    4:6] <- tmp[z == 9, 1]
  logMeanMat[z ==  9,    7:9] <- tmp[z == 9, 3]
  
  # DEG type 10
  logMeanMat[z ==  10,   1:3] <- tmp[z == 10, 3]
  logMeanMat[z ==  10,   4:6] <- tmp[z == 10, 1]
  logMeanMat[z ==  10,   7:9] <- tmp[z == 10, 2]
  
  # DEG type 11
  logMeanMat[z ==  11,   1:3] <- tmp[z == 11, 2]
  logMeanMat[z ==  11,   4:6] <- tmp[z == 11, 3]
  logMeanMat[z ==  11,   7:9] <- tmp[z == 11, 1]
  
  # DEG type 12
  logMeanMat[z ==  12,   1:3] <- tmp[z == 12, 3]
  logMeanMat[z ==  12,   4:6] <- tmp[z == 12, 2]
  logMeanMat[z ==  12,   7:9] <- tmp[z == 12, 1]
  
  logFC <- logMeanMat[, c(1, 4, 7)] - logMeanMat[, 1]
  
  list(logFC = logFC, logMeanMat = logMeanMat)
}


x <- generateCountMeanMatrix(Ngene = 20000, z = rep(0:12, c(14000, rep(500, 12))))
counts <- matrix(rnbinom(20000 * 9, size = 10, mu = exp(x$logMeanMat)), 20000, 9)
head(counts)

##>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
##> [1,]   45   47   52   39   52   28   68   33   41
##> [2,]   86   67   65   35   45   52   84   43   41
##> [3,]  106  197  140  200  101  196   66  126  161
##> [4,]    9    6   12   18   10   10    6   19   17
##> [5,]   24   38   30   63   46   53   27   46   25
##> [6,]  120  237  246  157  270  396  329  163  189

heatmap(counts, Rowv = NA, Colv = NA)

f:id:menseki:20191121222806p:plain

それっぽいデータが得られました。