Manipulate character string using gsub() and perform multivariate data cleaning efficiently in R

0 votes

How can I manipulate character strings using gsub() and clean multivariate data efficiently in R? I need to:

  • Convert all million values into billions, using 1 M = 0.001 B.
  • Remove the unit symbols B and M so that only numeric values remain.
Nov 13, 2018 in Data Analytics by Ali
• 10,380 points
15 views

1 answer to this question.

0 votes

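The gsubfn package works well for this task. gsubfn() replaces each unit letter with a multiplication ("B" becomes "* 1", "M" becomes "* 1e-3"), and each resulting expression, such as "808 * 1e-3", is then evaluated to a number:

# sample data
x <- c("1.2 B", "2.5 B", "808 M")

library(gsubfn)
# replace the unit letter with its multiplier, then evaluate each expression
as.vector(sapply(gsubfn("[A-Z]", list(B = "* 1", M = "* 1e-3"), x),
                 function(expr) eval(parse(text = expr))))
#[1] 1.200 2.500 0.808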
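
Since the question asks about gsub(), the same conversion can also be sketched in base R without eval(parse()). This is only a minimal sketch: the helper name to_billions and the data frame df with its columns revenue and profit are hypothetical and not part of the original post, and it assumes every value is a non-negative number followed by either B or M.

```r
# Convert strings such as "1.2 B" or "808 M" to numeric values in billions.
# to_billions is a hypothetical helper name used for illustration only.
to_billions <- function(x) {
  # Keep only digits and the decimal point, then convert to numeric
  value <- as.numeric(gsub("[^0-9.]", "", x))
  # 1 M = 0.001 B, so scale only the entries that carry an "M" suffix
  ifelse(grepl("M", x, fixed = TRUE), value * 1e-3, value)
}

x <- c("1.2 B", "2.5 B", "808 M")
to_billions(x)
#[1] 1.200 2.500 0.808

# Hypothetical data frame with several columns to clean (names are illustrative)
df <- data.frame(revenue = c("1.2 B", "808 M"),
                 profit  = c("300 M", "0.4 B"),
                 stringsAsFactors = FALSE)

# Apply the same cleaning to every column while keeping the data frame structure
df[] <- lapply(df, to_billions)
```

Assigning to df[] rather than df keeps the result a data frame, so every column is cleaned in a single pass.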
answered Nov 13, 2018 by Maverick
• 10,020 points

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.