Create a Markov Model that can generate text simulations by studying Donald Trump speech data set

I need help with a project. The problem statement is:

"To apply Markov Property and create a Markov Model that can generate text simulations by studying Donald Trump speech data set."
Aug 2, 2019 in Machine Learning by Sameer

1 answer to this question.


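The logic here is simple. Apply the Markov property to generate text in the style of Donald Trump's speeches: go through every word used in the speeches and, for each word, build a dictionary entry listing the words that come right after it. The next word is then chosen using only the current word, which is exactly what the Markov property says.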

I am not just going to hand you the code for your project. I think you should understand the concept, so I will do my best to explain each step.
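To make the idea concrete before the steps, here is a minimal sketch on a made-up sentence (the sentence and the names `sentence`, `words`, and `transitions` are illustrative, not taken from the speech data). Each word maps to the list of words that followed it. A word that appears more often in that list is proportionally more likely to be picked next, so the transition probabilities are stored implicitly.

```python
# Toy example: the sentence below is invented for illustration only.
sentence = "we will win and we will build and we win"
words = sentence.split()

transitions = {}
for current, following in zip(words, words[1:]):
    # setdefault creates the list the first time a word is seen
    transitions.setdefault(current, []).append(following)

print(transitions["we"])    # ['will', 'will', 'win']
print(transitions["will"])  # ['win', 'build']
```

Here "we" is followed by "will" twice and "win" once, so a random pick from that list returns "will" about two times out of three.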

1. Start by importing the required library:

import numpy as np

2. Read the dataset

# Read the whole speech file into one string (change the path to match your machine)
trump = open('C://Users//NeelTemp//Desktop//demos//speeches.txt', encoding='utf8').read()
# Display the data
print(trump)

3. Split the dataset into individual words

# split() with no arguments breaks the text on any whitespace
corpus = trump.split()
# Display the corpus
print(corpus)

4. Next, create a function that generates the pairs of consecutive words in the speeches. To save memory, we’ll use a generator, which produces the pairs one at a time instead of storing them all in a list.

def make_pairs(corpus):
    # Yield each word together with the word that follows it
    for i in range(len(corpus) - 1):
        yield (corpus[i], corpus[i + 1])

pairs = make_pairs(corpus)
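One thing to keep in mind is that a generator can be consumed only once. The sketch below uses a tiny made-up corpus (`toy_corpus` is illustrative, not the speech file) to show that a second pass over the same generator yields nothing, so call make_pairs(corpus) again if you need the pairs a second time.

```python
def make_pairs(corpus):
    # Yield each word together with the word that follows it
    for i in range(len(corpus) - 1):
        yield (corpus[i], corpus[i + 1])

toy_corpus = ["make", "america", "great", "again"]  # illustrative data
pairs = make_pairs(toy_corpus)

first_pass = list(pairs)   # consumes the generator
second_pass = list(pairs)  # the generator is already exhausted

print(first_pass)   # [('make', 'america'), ('america', 'great'), ('great', 'again')]
print(second_pass)  # []
```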

5. Next, let’s initialize an empty dictionary to store the pairs of words.

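word_dict = {}
for word_1, word_2 in pairs:
    if word_1 in word_dict:
        # word_1 was seen before: add word_2 to its list of followers
        word_dict[word_1].append(word_2)
    else:
        # first time we see word_1: start a new list
        word_dict[word_1] = [word_2]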
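If you want to see the transition probabilities that such a dictionary implies, you can count the followers of a word. This is an optional sketch on made-up data: `followers` and `next_word_probabilities` are hypothetical names, and collections.defaultdict is used as an alternative way to build the same kind of word-to-followers mapping.

```python
from collections import Counter, defaultdict

toy_corpus = "we will win we will build we win".split()  # illustrative data

# defaultdict(list) creates an empty list automatically for unseen words
followers = defaultdict(list)
for word_1, word_2 in zip(toy_corpus, toy_corpus[1:]):
    followers[word_1].append(word_2)

def next_word_probabilities(word):
    # Relative frequency of each word observed right after `word`
    counts = Counter(followers[word])
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probabilities("will"))  # {'win': 0.5, 'build': 0.5}
```

Picking a random element from the followers list is equivalent to sampling from these probabilities, which is why the simple list-based dictionary is enough for text generation.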

6. Build the model. We'll pick a random starting word from the corpus and then form the chain by repeatedly choosing one of the words that followed the current word.

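# Randomly pick the first word
first_word = np.random.choice(corpus)
# Keep picking until the word is capitalized, so the chain does not start in the middle of a sentence
while first_word.islower():
    first_word = np.random.choice(corpus)
# Start the chain from the picked word
chain = [first_word]
# Number of simulated words to generate
n_words = 20
# Extend the chain: each new word is drawn from the words that followed the last one
for i in range(n_words):
    chain.append(np.random.choice(word_dict[chain[-1]]))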
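When a chain is extended word by word, looking up the current word in word_dict raises a KeyError if that word was never followed by anything, for example the very last word of the corpus. Below is a hedged sketch of a helper that stops early in that case. The function name `generate_chain`, its `seed` parameter, and the toy dictionary are assumptions for illustration, and it uses Python's built-in random module instead of NumPy so it runs without extra installs.

```python
import random

def generate_chain(word_dict, first_word, n_words=20, seed=None):
    """Walk the chain, stopping early if a word has no recorded follower."""
    rng = random.Random(seed)  # seed makes runs repeatable; None means random
    chain = [first_word]
    for _ in range(n_words):
        followers = word_dict.get(chain[-1])
        if not followers:
            break  # dead end, e.g. the final word of the corpus
        chain.append(rng.choice(followers))
    return chain

toy_dict = {"We": ["will"], "will": ["win"]}  # illustrative data
print(' '.join(generate_chain(toy_dict, "We")))  # We will win
```

In the toy dictionary "win" has no followers, so the chain ends after three words instead of raising an error.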

7. Finally, let's display the simulated text

# join() combines the words in the chain into a single string
print(' '.join(chain))

And you are done! Congratulations. Have a look at this blog for a better understanding of this concept. 

answered Aug 2, 2019 by Zaid
