Hadoop intervals and JOIN

0 votes

I'm currently trying to join two sources of data where the key is an interval (say [date-begin/date-end]). For example:

input1:

20091001-20091002    A
20091011-20091104    B
20080111-20091103    C
(...)

input2:

20090902-20091003    D
20081015-20091204    E
20040011-20050101    F
(...)

I'd like to find all the records where the key1 overlaps the key2. Is it possible with hadoop ? Where can I find an example of implementation ?

Sep 24, 2018 in Big Data Hadoop by digger
• 26,720 points
183 views

1 answer to this question.

0 votes

Hey, a solution was given on Biostar: http://biostar.stackexchange.com/questions/8821. Hope this helps

answered Sep 24, 2018 by slayer
• 29,310 points

Related Questions In Big Data Hadoop

0 votes
10 answers

What is the difference between Mongodb and Hadoop?

MongoDB is a NoSQL database, whereas Hadoop is ...READ MORE

answered Jun 20, 2018 in Big Data Hadoop by jenny_code
8,688 views
0 votes
1 answer

How can I download only hdfs and not hadoop?

No, you cannot download HDFS alone because ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
453 views
+2 votes
10 answers

Is there any difference between “hdfs dfs” and “hadoop fs” shell commands?

Yes, there's a difference between hadoop fs and ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Kunal
23,089 views
0 votes
1 answer

How to install and configure a multi-node Hadoop cluster?

I would recommend you to install Cent ...READ MORE

answered Mar 22, 2018 in Big Data Hadoop by Shubham
• 13,480 points
1,508 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
8,051 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
67,132 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
3,063 views
0 votes
1 answer
0 votes
1 answer

Hadoop Java Error: java.lang.NoClassDefFoundError: WordCount (wrong name: org/myorg/WordCount)

Hey, try this code import java.io.IOException; import java.util.Iterator; import java.util.StringTokenizer; import ...READ MORE

answered Sep 19, 2018 in Big Data Hadoop by slayer
• 29,310 points
3,849 views
+3 votes
5 answers

Hadoop DistributedCache is deprecated - what is the preferred API?

I had the same problem. And not ...READ MORE

answered Oct 12, 2018 in Big Data Hadoop by Rohan
1,048 views