Thursday, December 8, 2011

ISTA 301 Blog: Briannatomaton

Part of one of my Computer Science 227 assignments was to implement a probabilistic text generator: a program that scans through a large text file, records which letters tend to follow which other letters most often, and then generates new text semi-randomly based on those probabilities. It's loads of fun, as it tends to produce Mad Libs-style nonsense, but in a prose style similar to that of the text it was fed.

Oddly enough, as an ISTA student, I'd already been exposed to this idea in Paul Cohen's class a few semesters ago, and shortly before the 227 assignment was posted, I joked with Brianna about setting up a text generator based on her writing. When it turned out that the final project in one of my classes was to implement this very idea, I was ecstatic.

What it does is read every substring of length n in a large text and build a hash table that maps each substring to a list of every character that follows it, weighted according to how often each one occurs. I use Google Voice for my text messages, so I was able to download an HTML archive of all the text messages Brianna has sent me in the last 6 months. 15 minutes and a little studying up on regular expressions later, I had removed all of my own responses as well as all the HTML tags and just had a solid plaintext block of Brianna's texts. I fed this, along with some of her blog and a few papers she wrote as an undergrad, into a gigantic text file which totalled about 82,000 words.
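For the curious, here is a rough Python sketch of the table-building idea described above. It is not the actual assignment code; the names `build_table`, `next_char`, and `generate` are made up for illustration, and the real thing may well have been structured differently.

```python
import random
from collections import Counter, defaultdict


def build_table(text, n):
    """Map every n-character substring to a Counter of the characters that follow it."""
    table = defaultdict(Counter)
    for i in range(len(text) - n):
        table[text[i:i + n]][text[i + n]] += 1
    return table


def next_char(table, context, rng=random):
    """Pick a character that followed `context`, weighted by how often it did."""
    followers = table.get(context)  # .get avoids creating empty entries
    if not followers:
        return None  # this context never appeared with a follower
    chars = list(followers)
    weights = [followers[c] for c in chars]
    return rng.choices(chars, weights=weights, k=1)[0]


def generate(table, n, start, length, rng=random):
    """Extend `start` one character at a time until `length` or a dead end."""
    out = start
    while len(out) < length:
        c = next_char(table, out[-n:], rng)
        if c is None:
            break
        out += c
    return out
```

Larger values of n tend to copy longer stretches of the source verbatim, while smaller values drift into gibberish faster, so the fun is in finding the middle ground.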

Briannatomaton was born. I modified the code slightly so that I could force it to "seed" each randomly generated blurb with a word or two at the beginning, which let me have it "talk to" specific names that occur in the text I fed into it. Why would I want to do that? Simple. Because Briannatomaton got her own Facebook page.
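The seeding tweak could look something like the sketch below. Again, this is a guess at the shape of it rather than the real code: `build_followers` and `seeded_blurb` are hypothetical names, and the defaults for `n` and `length` are arbitrary. Here the table stores plain lists with repeats, so a uniform pick from the list is already weighted by frequency.

```python
import random
from collections import defaultdict


def build_followers(text, n):
    """Map each n-character substring to the list of characters seen after it.
    Repeats are kept, so a uniform random choice is frequency-weighted."""
    followers = defaultdict(list)
    for i in range(len(text) - n):
        followers[text[i:i + n]].append(text[i + n])
    return followers


def seeded_blurb(text, seed, n=4, length=200, rng=random):
    """Generate a blurb that begins with `seed` (hypothetical helper)."""
    if len(seed) < n:
        raise ValueError("seed must be at least n characters long")
    followers = build_followers(text, n)
    out = seed
    while len(out) < length:
        options = followers.get(out[-n:])
        if not options:
            break  # the tail of the output never occurs in the source text
        out += rng.choice(options)
    return out
```

The seed only works well if its last n characters actually occur somewhere in the source text; otherwise the generator has nothing to continue from and simply returns the seed.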
