How to select columns containing specific text in R

0 votes

Hi to all,

I have data in a table like this:

IND snp1 snp2 snp3 snp4 snp5
1 A/G T/T T/C G/C G/G
2 A/A C/C G/G G/G A/A
3 T/T G/G C/C C/C T/T

I want to select only snp1, snp3, and snp4, that is, the columns containing genotypes made of two different letters, such as A/G, T/C, and G/C. Can anyone help me select only those columns? Any help is highly appreciated.

Thanks in advance

Aug 7, 2020 in Data Analytics by tomato
• 120 points

edited Aug 7, 2020 by MD 1,313 views

1 answer to this question.

0 votes

Hi,

You can use the select() function from the dplyr package to extract particular columns, and then index the result with [ ] to pick rows. For your dataset, you can use the code below.

> df
  IND snp1 snp2 snp3 snp4 snp5
1   1  A/G  T/T  T/C  G/C  G/G
2   2  A/A  C/C  G/G  G/G  A/A
3   3  T/T  G/G  C/C  C/C  T/T

> library(dplyr)
> select(df, snp1, snp3, snp4)[1, 1:3]
  snp1 snp3 snp4
1  A/G  T/C  G/C

I hope this will give you the approach.

answered Aug 7, 2020 by MD
• 95,140 points
Hi

Thanks for taking the time to help me. Yes, I can do that with the select function, but my case is different: I have 20,000 columns and I do not know which columns contain this type of genotype (A/T, A/G, A/C, T/A, T/G, T/C, G/A, G/T, G/C, C/A, C/T, C/G). First I need to identify the columns in my data file that contain these genotypes, and then extract them into a different file. I hope I have explained it well. Once again, thanks a lot for your help.
I tried this code to extract the rows containing specific text across columns:

collist <- c("SNP1","SNP2","SNP3","SNP4","SNP5")
sel <- apply(red[,collist],1,function(row) length(grep("T/C",row))>0)
red[sel,]

Now I want to extract the columns containing "T/C" across columns SNP1 to SNP5, and I applied the function like this:

sel <- apply(red[,collist],2,function(cols) length(grep("A/G",cols))>0)
red[sel,]

But this does not give me the required output, i.e. snp3. Can anyone help me sort this out?

Thanks in advance for your help.
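
A likely cause, sketched on a small copy of the example table: with MARGIN = 2 the test yields one TRUE/FALSE per column, so the result has to go in the column position of [ , ], whereas red[sel,] uses it to pick rows. The data frame below is a stand-in built from the table in the question, and the lowercase column names are an assumption taken from that table (R column names are case-sensitive, so SNP1 and snp1 are different columns).

```r
# Stand-in for the real data, copied from the example table in the question.
red <- data.frame(
  IND  = 1:3,
  snp1 = c("A/G", "A/A", "T/T"),
  snp2 = c("T/T", "C/C", "G/G"),
  snp3 = c("T/C", "G/G", "C/C"),
  snp4 = c("G/C", "G/G", "C/C"),
  snp5 = c("G/G", "A/A", "T/T"),
  stringsAsFactors = FALSE
)

collist <- c("snp1", "snp2", "snp3", "snp4", "snp5")

# One logical value per column: does the column contain "T/C" anywhere?
sel <- sapply(red[, collist], function(col) any(grepl("T/C", col, fixed = TRUE)))

# Use the logical vector to choose columns (second index), not rows.
# drop = FALSE keeps the result a data frame even when one column matches.
hits <- red[, collist[sel], drop = FALSE]
print(names(hits))
```

On this small table only snp3 contains "T/C", so hits has that single column.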
Hi,

Can you explain your query a little bit more?
Dear MD,

Good morning.

What you did is right, and I could give column numbers if my data were small, but I have more than 20,000 columns and I do not know which of them contain this type of genotype (A/T, A/G, A/C, T/A, T/G, T/C, G/A, G/T, G/C, C/A, C/T, C/G). I would like to specify these genotypes instead of column numbers, and subset only the columns that contain them.

I hope I have explained it well.

Thanks in advance

Hi,

You need to write your own script. It should check every column for your pattern: if the column matches, append its name to a list; otherwise skip that column.
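
A minimal sketch of such a script, under a few assumptions: the data frame df here is a stand-in built from the example table, the sample ID column is assumed to be called IND, and the output file name het_snps.txt is hypothetical. It checks each genotype column against the twelve two-letter combinations listed in the question and keeps the columns where at least one row matches.

```r
# Stand-in data; with the real file, read it first, e.g. with read.table().
df <- data.frame(
  IND  = 1:3,
  snp1 = c("A/G", "A/A", "T/T"),
  snp2 = c("T/T", "C/C", "G/G"),
  snp3 = c("T/C", "G/G", "C/C"),
  snp4 = c("G/C", "G/G", "C/C"),
  snp5 = c("G/G", "A/A", "T/T"),
  stringsAsFactors = FALSE
)

# The genotypes to look for, as listed in the question.
het_patterns <- c("A/T", "A/G", "A/C", "T/A", "T/G", "T/C",
                  "G/A", "G/T", "G/C", "C/A", "C/T", "C/G")

# Every column except the sample ID column (assumed to be "IND").
snp_cols <- setdiff(names(df), "IND")

# One TRUE/FALSE per column: TRUE if any value is one of the patterns.
keep <- vapply(df[snp_cols], function(col) any(col %in% het_patterns), logical(1))

# Names of the matching columns, then the subset with the ID column kept.
matching_cols <- snp_cols[keep]
het_df <- df[, c("IND", matching_cols), drop = FALSE]

# Write the subset to a separate tab-delimited file (hypothetical file name).
out_file <- file.path(tempdir(), "het_snps.txt")
write.table(het_df, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

On the example table this keeps snp1, snp3, and snp4. Because the check is an exact match with %in% rather than a regular expression, values with extra spaces or a different separator would need to be cleaned first.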
