As an ‘Avid Googler’ whose day just can’t start without typing something into the search bar, I must acknowledge that there are millions of users like me who can’t imagine an alternative to the search behemoth. In the rarest of rare cases where I can’t find what I am looking for, I might be tempted to venture outside Google. Names like Bing, Yahoo or even DuckDuckGo often cross my mind as I look for a search alternative.
This got me curious: what other search engines are there? How does a search engine work? Can you build your own search engine? What exists beyond?
Back in 1999, Doug Cutting started work on a project titled ‘Lucene’, which was meant to offer an open-source alternative to proprietary search engines such as those behind Google and Yahoo. The industry needed an open-source search engine, as few were available at the time. Interestingly, Lucene (a software library) provides the basic building blocks of a search engine, and it was crucial in propelling ‘Solr’ forward.
Solr was subsequently created in 2004 by Yonik Seeley, built on top of Lucene. Originally named Solar (Search on Lucene and Resin), it became open source in 2006 and is now a project of the Apache Software Foundation.
For the layman, Solr is best seen as a platform that supports websites: it can index and search multiple sites and return recommendations for related content based on the query. It is a popular search platform, especially for enterprise search, where it is used to index and search documents and email attachments.
The relationship between Apache Lucene and Solr can be compared to that between an engine and a car: Lucene is the engine, and Solr is the car built around it. Apache Lucene is an extensive information retrieval library, initially developed by Doug Cutting (co-creator of Hadoop), that specializes in indexing and searching. Its distinctive advantages are that it scales and performs well. It provides advanced search options such as synonyms, stop-words, and matching based on similarity and proximity. And it is open source!
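To make ‘indexing and searching’ a little more concrete, here is a minimal sketch of an inverted index in Python, the basic structure a library like Lucene builds on. It is a toy illustration and not Lucene’s implementation; the names `STOP_WORDS`, `SYNONYMS`, `analyze` and `TinyIndex` are all made up for this example, and it covers only stop-words and synonyms, not similarity or proximity.

```python
import re
from collections import defaultdict

# Hypothetical, tiny word lists chosen only for this illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "for", "in", "on"}
SYNONYMS = {"automobile": "car", "auto": "car"}  # variant -> canonical term


def analyze(text):
    """Lowercase, split into tokens, drop stop-words, map synonyms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [SYNONYMS.get(t, t) for t in tokens if t not in STOP_WORDS]


class TinyIndex:
    """Toy inverted index: each term maps to the set of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in analyze(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every term of the query."""
        terms = analyze(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyIndex()
index.add(1, "The engine of a car")
index.add(2, "An automobile for the family")
index.add(3, "Search engine basics")

print(index.search("auto"))        # documents 1 and 2, via the synonym mapping
print(index.search("car engine"))  # only document 1 contains both terms
```

Because the same `analyze` step runs at index time and at query time, a search for ‘auto’ finds the document that only says ‘automobile’, and stop-words such as ‘the’ never reach the index.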
It would actually be pointless to compare Solr to Google Search, a key reason being that the Solr project was never meant to compete with Google’s public search engine, but rather with Google’s search appliances and other products aimed at enterprises. To break it down further, organizations today invest heavily in networks and servers that store petabytes of big data. Once the storage challenge is addressed, the next one is fetching and retrieving information. Solr often fills this role: it is used to find things inside an organization’s own private network, hence the demand for enterprise search. It doesn’t come as a surprise that the search and navigation features of many of the world’s largest sites, such as AOL, Yahoo and CNET, are powered by Solr.
The graph above highlights how interest in Solr has picked up since 2005, most likely propelled by the demand for an open-source search engine.
Given its strength as the engine behind search, Solr has seen wide adoption among market leaders in various domains. Popular sites like Netflix, eBay, Simply Hired, Twitter, LinkedIn and Flipkart are active users of Solr.
Imagine the late Steve Jobs using a Samsung phone, or Boo-Keun Yoon (CEO, Samsung Electronics) using an iPhone for personal use. It would definitely be hard to digest.
Yet, one interesting fact about Solr is that it is used by Google.
Now, let’s not jump to conclusions when I say Google uses Solr. First of all, Google is not abandoning its own search engine, and secondly, it is using Solr for a totally different purpose. Google’s ‘All for Good’ is a non-profit organization that lists volunteer opportunities, with search enhanced by Solr.
So why adopt Solr when they can use their own search engine?
Let’s take a deeper look. One of the challenges for the ‘All for Good’ site is that the list of volunteer opportunities is not updated as frequently as needed. According to Google’s Public Sector team, crawlers take time to find new information, so updates do not show up immediately. Solr addresses this: non-profits see their opportunities indexed faster, and users see more relevant results.
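To illustrate the general idea of pushing new content to a search index instead of waiting for a crawler, here is a hedged sketch that builds a request for Solr’s JSON update endpoint. This is not how ‘All for Good’ is actually set up; the host `http://localhost:8983`, the collection name `opportunities` and the document fields are assumptions made up for the example. The code only constructs the request and does not send it.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs):
    """Build (but do not send) a POST that adds documents to a Solr collection."""
    # commit=true asks Solr to make the new documents searchable right away.
    url = f"{base_url}/solr/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical volunteer opportunity; the field names are illustrative only.
docs = [{"id": "opp-1", "title": "Weekend park cleanup"}]
request = build_update_request("http://localhost:8983", "opportunities", docs)

print(request.get_method(), request.full_url)
# With a Solr server running, urllib.request.urlopen(request) would send it.
```

The point of the sketch is the direction of the data flow: the site that owns the content tells the index about a new listing as soon as it exists, instead of waiting to be discovered.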
Good for a start. Welcome Solr to the Search Engine world.
Got a question for us? Please mention it in the comments section and we will get back to you.