I have code that I want to use to alter values in two columns of my dataset

0 votes

I have a dataset with "Speed" as one of its columns (features). The column contains both zero and non-zero values. I want to randomly set 10% of the non-zero values to zero. For every row changed this way, the corresponding class label should also be set to zero. I tried the code below, but it raises the error shown after it.

import pandas as pd

file_path = 'Processed_data/data1.csv'
df = pd.read_csv(file_path)
per_change = 0.1
attr = 'Speed'
target = 'Class'
df_spd = df[df['Speed'] > 0.]

num_rows_to_change = int(df.shape[0] * per_change)
num_with_zero_initial = df[df[attr] == 0].shape[0]
assert df_spd.shape[0] > num_rows_to_change, \
    'Number of rows with non-zero speed is less than 10% of the original dataset.'
df_update = df_spd.sample(num_rows_to_change)
df_update[attr] = 0.
df_update[target] = 0.
update_list = df_update.index.tolist()
num_with_zero_final = df[df['Speed'] == 0].shape[0]
assert num_with_zero_final == num_with_zero_initial + num_rows_to_change, \
'Number of rows needed to change not equal to number of rows changed.'

Traceback (most recent call last)
<ipython-input-11-f93535705bac> in <module>
1 assert num_with_zero_final == num_with_zero_initial + num_rows_to_change, \
----> 2 'Number of rows needed to change not equal to number of rows changed.'
AssertionError: Number of rows needed to change not equal to number of rows changed.

Mar 17, 2019 in Python by elvin
• 130 points

1 answer to this question.

+1 vote

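Hi @elvin. I read your script and found the approach a little complex. Note that df_spd.sample(...) returns a new DataFrame, so assigning zeros to df_update never changes df itself, which is why the final assertion fails. I have written a simpler script that writes the zeros back into df. Try this: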

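import pandas as pd

file_path = 'path/to/your/file.csv'  # placeholder: replace with the path to your file
df = pd.read_csv(file_path)

# Pick a random 10% of the rows whose Speed is above zero and keep their index labels
change = df.query('Speed > 0').sample(frac=.1).index
# Write the zeros back into df itself, in both columns
df.loc[change, 'Speed'] = 0
df.loc[change, 'Class'] = 0

# Save the modified dataset
df.to_csv('data1.csv', header=True, index=False)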
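
If you want to check the result the way the assertions in the question do, here is a minimal sketch on made-up data. The toy DataFrame, the helper name zero_out_fraction, and the seed are assumptions for illustration only; they are not part of your dataset or of pandas. The idea is the same as above: sample index labels from the non-zero rows, then assign through .loc so the change lands in the frame you keep.

```python
import pandas as pd

# Hypothetical toy data standing in for data1.csv; the values are made up.
df = pd.DataFrame({
    'Speed': [0.0, 12.5, 30.0, 0.0, 45.1, 8.2, 19.9, 0.0, 27.3, 33.3, 5.5, 61.0, 14.4],
    'Class': [0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1],
})


def zero_out_fraction(data, attr='Speed', target='Class', frac=0.1, seed=None):
    """Return a copy of `data` with `frac` of its non-zero `attr` rows zeroed
    in both `attr` and `target`, plus the index labels that were changed.

    Hypothetical helper written for this example only.
    """
    out = data.copy()
    # Sample only from rows where attr is above zero and keep their index labels.
    change = out[out[attr] > 0].sample(frac=frac, random_state=seed).index
    # Assigning through .loc writes into `out` itself, not into a temporary copy.
    out.loc[change, [attr, target]] = 0
    return out, change


zeros_before = (df['Speed'] == 0).sum()
updated, changed = zero_out_fraction(df, seed=42)
zeros_after = (updated['Speed'] == 0).sum()

# Same kind of check as in the question, now against the frame that was modified.
assert zeros_after == zeros_before + len(changed)
print(zeros_before, zeros_after, list(changed))
```

Note that frac=0.1 here means 10% of the non-zero rows, whereas the script in the question takes 10% of all rows (df.shape[0]); pick whichever matches what you need.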

Let me know if this does what you want.

answered Mar 17, 2019 by Omkar
• 69,030 points
Thanks a lot, Omkar. The code did the job perfectly.
