kids | The stupidest thing...

Posts Tagged ‘kids’

Randomized Hobbit

22 Jun 2015

@wrathematics pointed me to his ngram R package for constructing and simulating from n-grams from text.

I’d recently grabbed the text of the hobbit, and so I applied it to that text, with amusing results.

Here’s the code I used to grab the text.

library(XML)
stem <- "http://www.5novels.com/classics/u5688"
hobbit <- NULL
for(i in 1:74) {
    cat(i,"\n")
    if(i==1) {
        url <- paste0(stem, ".html")
    } else {
        url <- paste0(stem, "_", i, ".html")
    }

    x <- htmlTreeParse(url, useInternalNodes=TRUE)
    xx <- xpathApply(x, "//p", xmlValue)
    hobbit <- c(hobbit, gsub("\r", "", xx[-length(xx)]))
    Sys.sleep(0.5)
}

Then calculate the ngrams with n=2.

library(ngram)
ng2 <- ngram(hobbit, n=2)

Simulate some number of words with babble(). If you use the seed argument, you can get the result reproducibly.

babble(ng2, 48, seed=53482175)

into trees, and then bore to the Mountain to go through?” groaned the hobbit. “Well, are you doing, And where are you doing, And where are you?” it squeaked, as it was no answer. They were surly and angry and puzzled at finding them here in their holes

Update: @wrathematics suggested that I mix two texts, so here’s a bit from the Hobbit in the Hat (The Hobbit with 59× Cat in the Hat — up-sampled to match lengths.) But there’s maybe not enough overlap between the two texts to get much of a mixture.

“I am Gandalf,” said the fish. This is no way at all!

already off his horse and among the goblin and the dragon, who had remained behind to guard the door. “Something is outside!” Bilbo’s heart jumped into his boat on to sharp rocks below; but there was a good game, Said our fish No! No! Those Things should not fly.

Tags:code, kids, R, textmining
Posted in R | 3 Comments »

More on Chutes & Ladders

20 May 2013

Matt Maenner asked about the sawtooth pattern in the figure in my last post on Chutes & Ladders.

Damn you, Matt! I thought I was done with this. Don’t feed my obsession.

My response was that if the game ends early, it’s even more likely that it’ll be the kid who went first who won. But, my intuition was wrong: exactly the opposite is true. It is the advantage to the first player that causes the sawtooth pattern, but that advantage increases with the number of rounds rather than decreases.

Numerical results

While it’s fast and easy to study the Chutes & Ladders game by simulation, if you want to answer questions more precisely, it’s best to switch to more exact results.

Consider a single individual playing the game, and let X_n be his/her location at round n. The X_n form a Markov chain, in that the future (X_n+1), given the present (X_n), is conditionally independent of the history (X₁, …, X_n-1).

It’s relatively easy to construct the transition matrix of the chain. (See my R code.) This is a matrix P, with P_ij = Pr(X_n+1 = j | X_n = i).

Then the probability that a player has reached state 100 by round n is
(1, 0, …, 0) Pⁿ (0, …, 0, 1)’. That’s the cumulative distribution function (cdf) of the number of rounds for a single player to finish the game. Call this q_n. You can get the probability distribution by differences, say p_n = q_n – q_n-1.

To calculate the number of rounds to complete a game with k players, you want the minimum of k independent draws from this distribution. The probability that a game with k players is complete by round n is 1 – (1-q_n)^k. And again you can get the probability distributions by differences. Here’s a picture.

Advantage to the first player

Now, regarding the advantage to the first player: note that the first player wins in exactly n steps if he gets to the finish at n steps and none of the other players are done by n-1 steps. So, with k players, the probability that the first player wins in exactly n steps is p_n (1-q_n-1)^k-1.

The chance that the second player wins in exactly n steps is (1-q_n) p_n (1-q_n-1))^k-2, with the last term included only if there are k > 2 players.

From this idea, it’s straightforward to calculate the probability that the first player wins given that the game is complete at round n. Here’s a plot of that probability as a function of the number of players, relative to the nominal probability (1/2, 1/3, 1/4).

Note that n=7 is the minimum number of rounds to complete the game. I’d thought that the first player’s advantage went down over time, but the opposite is true.

No. spins to end the game

Combining these two results (on the number of rounds to complete the game and the probability that player i will win in n rounds), we can get a more precise version of the simulation-based figure in my last post:

As you can see, the sawtooth pattern becomes more pronounced with the number of rounds, but then it gets lost in the downward slope of the distribution on the right side. (Again, see my R code.)

Tags:kids, Markov chains
Posted in R, Statistics | 1 Comment »

Chutes & ladders: How long is this going to take?

17 May 2013

I was playing Chutes & Ladders with my four-year-old daughter yesterday, and I thought, “How long is this going to take?”

I saw an interesting mathematical analysis of the game a few years ago, but it seems to be offline, though you can read it via the wayback machine.

But that didn’t answer my specific question, namely, “How long is this going to take?”

So I wrote a bit of R code to simulate the game.

Here’s the distribution of the number of spins to complete the game, by number of players:

With two players, the average number of spins is 52, with a 90th percentile of 88.

If you add a third player, the average increases to 65, and the 90th percentile increases to 103. You’re playing fewer rounds, but each round is three times as long. If you add a fourth player, the average is 76 and the 90th percentile is 117.

So, in trying to minimize the agony, it seems best to not encourage my eight-year-old son to join us in the game. If he plays with us, there’s a 63% chance that it will take longer.

And that’s particularly true because then the chance of my daughter winning drops from about 1/2 to about 1/3.

That raises another question: if I let her go first, what advantage does that give her? Not much. The chance that the person who goes first will win is 50.9%, 34.4%, and 25.9%, respectively, when there are 2, 3, and 4 players. So not a noticeable amount. Thus I cheat (on her behalf). Really, though, I’m cheating in order to shorten the game as much as to ensure that she wins.

Note: There’s a close connection between this problem and my work on the multiple-strain recombinant inbred lines. (See this and that.) I’m tempted to play around with it some more.

Additional numerical results here.

Tags:kids, Markov chains, simulation
Posted in R, Statistics | 7 Comments »

The stupidest thing…

Posts Tagged ‘kids’

Randomized Hobbit

More on Chutes & Ladders

Numerical results

Advantage to the first player

No. spins to end the game

Chutes & ladders: How long is this going to take?

Pages

Recent Posts

Archives

Categories