How to use Python to read blocks of data in a txt file and convert them to structured data

0 votes

I have recently been building an exam test system, and my questions and answers are organized as follows in a txt file:

Q1

All of the following are basic components of a security policy EXCEPT the

A. definition of the issue and statement of relevant terms.

B. statement of roles and responsibilities.

C. statement of applicability and compliance requirements.

D. statement of performance of characteristics and requirements.

Answer: D

Explaination: Policies are considered the first and highest level of documentation, from which the lower level elements of standards, procedures, and guidelines flow. This order, however, does not mean that policies are more important than the lower elements. These higher-level policies, which are the more general policies and statements, should be created first in the process for strategic reasons, and then the more tactical elements can follow. -Ronald Krutz The CISSP PREP Guide (gold edition) pg 13

Q2

Ensuring the integrity of business information is the PRIMARY concern of

A. Encryption Security

B. Procedural Security.

C. Logical Security

D. On-line Security

Answer: B

Explaination: Procedures are looked at as the lowest level in the policy chain because they are closest to the computers and provide detailed steps for configuration and installation issues. They provide the steps to actually implement the statements in the policies, standards, and guidelines...Security procedures, standards, measures, practices, and policies cover a number of different subject areas. - Shon Harris All-in-one CISSP Certification Guide pg 44-45

Q3

Which one of the following is an important characteristic of an information security policy?

A. Identifies major functional areas of information.

B. Quantifies the effect of the loss of the information.

C. Requires the identification of information owners.

D. Lists applications that support the business function.

Answer: A

Explaination: Information security policies area high-level plans that describe the goals of the procedures. Policies are not guidelines or standards, nor are they procedures or controls. Policies describe security in general terms, not specifics. They provide the blueprints for an overall security program just as a specification defines your next product - Roberta Bragg CISSP Certification Training Guide (que) pg 206

What I want to do is transform each question into a structured data format so that I can store the questions in a database.

[Image: organized format]

I want to use Python for this task. I more or less know that I need regular expressions, but I don't know how to go about it.

Can anyone help with this? Your help would be really appreciated! Thanks!

Apr 19, 2023 in Cyber Security & Ethical Hacking by anish
• 400 points
868 views

1 answer to this question.

0 votes

To extract structured data from the text file, you can use regular expressions to match the patterns of the questions and answers. Here is some sample code that might help you get started:

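import re

# Read the text file
with open('questions.txt', 'r', encoding='utf-8') as file:
    text = file.read()

# Define regular expression patterns (all used with re.MULTILINE so ^ and $ match per line)
# A question block starts at a line holding only "Q<number>" and runs to the next such line or the end of the text
pattern_q = r'^(Q\d+)\s*$(.*?)(?=^Q\d+\s*$|\Z)'
# An option is a whole line starting with "A." through "D."
pattern_a = r'^[A-D]\..*$'
pattern_ans = r'^Answer:\s*([A-D])'
# "Explaination" is spelled the way it appears in the data file
pattern_exp = r'^Explaination:\s*(.*)$'

# Match the patterns in the text and extract the data
questions = []
for qid, body in re.findall(pattern_q, text, re.DOTALL | re.MULTILINE):
    question = {}
    question['id'] = qid
    # The question text is everything before the first option line
    first_option = re.search(pattern_a, body, re.MULTILINE)
    question['text'] = (body[:first_option.start()] if first_option else body).strip()
    question['answers'] = []
    for match_a in re.findall(pattern_a, body, re.MULTILINE):
        question['answers'].append(match_a.strip())
    match_ans = re.search(pattern_ans, body, re.MULTILINE)
    if match_ans:
        question['correct_answer'] = match_ans.group(1)
    match_exp = re.search(pattern_exp, body, re.MULTILINE)
    if match_exp:
        question['explanation'] = match_exp.group(1).strip()
    questions.append(question)

# Print the structured data
print(questions)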

This code uses regular expressions to match the patterns of the questions and answers, and then extracts the data into a list of dictionaries, where each dictionary represents a single question with its associated answers, correct answer, and explanation. You can modify this code as per your requirements to store the data in a database.
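For the database step, here is a minimal sketch using Python's built-in sqlite3 module. It is only one possible layout, not the required one: the table names `question` and `choice`, the column names, and the `save_questions` helper are assumptions made for illustration, and the sample entry is a shortened, hand-written stand-in for the parsed list.

```python
import sqlite3

# Hypothetical sample in the same shape as the `questions` list built above
# (the explanation is shortened here for brevity).
questions = [
    {
        "id": "Q2",
        "text": "Ensuring the integrity of business information is the PRIMARY concern of",
        "answers": [
            "A. Encryption Security",
            "B. Procedural Security.",
            "C. Logical Security",
            "D. On-line Security",
        ],
        "correct_answer": "B",
        "explanation": "Procedures are looked at as the lowest level in the policy chain.",
    },
]


def save_questions(conn, questions):
    """Store parsed questions and their options in two related tables.

    Table and column names are illustrative assumptions.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS question ("
        "id TEXT PRIMARY KEY, body TEXT, correct_answer TEXT, explanation TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS choice ("
        "question_id TEXT REFERENCES question(id), letter TEXT, body TEXT)"
    )
    for q in questions:
        conn.execute(
            "INSERT OR REPLACE INTO question VALUES (?, ?, ?, ?)",
            (q["id"], q["text"], q.get("correct_answer"), q.get("explanation")),
        )
        for answer in q["answers"]:
            # Each answer looks like "B. Procedural Security."; split off the letter.
            letter, _, body = answer.partition(".")
            conn.execute(
                "INSERT INTO choice VALUES (?, ?, ?)",
                (q["id"], letter.strip(), body.strip()),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the data
save_questions(conn, questions)
rows = conn.execute(
    "SELECT letter, body FROM choice WHERE question_id = ? ORDER BY letter",
    ("Q2",),
).fetchall()
print(rows)
```

Keeping the options in their own table means a question can have any number of choices; if every question always has exactly four, a single table with four option columns would work just as well.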

answered Apr 19, 2023 by Edureka
• 12,690 points
