Hadoop intervals and JOIN

0 votes

I'm currently trying to join two sources of data where the key is an interval (say [date-begin/date-end]). For example:

input1:

20091001-20091002    A
20091011-20091104    B
20080111-20091103    C
(...)

input2:

20090902-20091003    D
20081015-20091204    E
20040011-20050101    F
(...)

I'd like to find all the records where the key1 overlaps the key2. Is it possible with hadoop ? Where can I find an example of implementation ?

Sep 24, 2018 in Big Data Hadoop by digger
• 26,700 points
141 views

1 answer to this question.

0 votes

Hey, a solution was given on Biostar: http://biostar.stackexchange.com/questions/8821. Hope this helps

answered Sep 24, 2018 by slayer
• 29,300 points

Related Questions In Big Data Hadoop

0 votes
10 answers

What is the difference between Mongodb and Hadoop?

Apart from the similarity that they are ...READ MORE

answered Dec 6, 2018 in Big Data Hadoop by Deeraj
7,466 views
0 votes
1 answer

How can I download only hdfs and not hadoop?

No, you cannot download HDFS alone because ...READ MORE

answered Mar 15, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
346 views
+2 votes
10 answers

Is there any difference between “hdfs dfs” and “hadoop fs” shell commands?

Yes, there's a difference between hadoop fs and ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Kunal
20,810 views
0 votes
1 answer

How to install and configure a multi-node Hadoop cluster?

I would recommend you to install Cent ...READ MORE

answered Mar 21, 2018 in Big Data Hadoop by Shubham
• 13,480 points
1,308 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
7,229 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
53,034 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
2,740 views
0 votes
1 answer
0 votes
1 answer

Hadoop Java Error: java.lang.NoClassDefFoundError: WordCount (wrong name: org/myorg/WordCount)

Hey, try this code import java.io.IOException; import java.util.Iterator; import java.util.StringTokenizer; import ...READ MORE

answered Sep 19, 2018 in Big Data Hadoop by slayer
• 29,300 points
3,383 views
+3 votes
5 answers

Hadoop DistributedCache is deprecated - what is the preferred API?

I had the same problem. And not ...READ MORE

answered Oct 12, 2018 in Big Data Hadoop by Rohan
884 views