Using the Solr Connector

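The Solr connector allows you to read search results from a Solr server. Using the Search Solr operators, you can run different search queries. This document walks you through how to install the Solr extension, connect to your Solr server, search your Solr server, and add data to your Solr server.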

Install the Solr extension

First, you need to install the Solr extension.

Connect to Your Solr Server

Before you can use the Solr connector, you have to configure a new Solr connection. For this purpose, you will need the connection details of your Solr server. Typically, the Solr server URL ends with the string '/solr'. If your Solr server requires authentication, you will also need valid credentials.

  1. In RapidMiner Studio, right-click on the repository you want to store your Solr connection in and choose Create Connection.

    You can also click on Connections > Create Connection and select the repository from the dropdown of the following dialog.

  2. Enter a name for the new connection, and set Connection Type to Solr.

  3. Click on Create and switch to the Setup tab in the Edit connection dialog.

  4. Fill in the connection details of your Solr server:

    The preconfigured URL is the default URL for a Solr server running on your local machine. Note that Solr does not require user authentication by default, but you can specify the username and password by selecting Uses authentication.

    While not required, we recommend testing your new Solr connection by clicking the Test connection button. If the test fails, please check whether the details are correct.

  5. Click Save to save your connection and close the Edit connection dialog.

You can now use the newly created connection with the Solr operators!
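
For readers who want to check the same connection details outside of RapidMiner Studio, the following is a minimal sketch in Python. It only builds a request and does not send it. The server URL `http://localhost:8983/solr`, the collection name `my_collection`, the credentials, and the helper `build_solr_request` are all assumptions for illustration, and it assumes the server uses HTTP Basic authentication and exposes Solr's ping handler under `admin/ping`. It does not show how the Solr connector itself performs its connection test.

```python
import base64
import urllib.request


def build_solr_request(base_url, path, username=None, password=None):
    """Build (but do not send) a request against a Solr server.

    base_url is expected to end in '/solr'; path is appended to it.
    If a username is given, an HTTP Basic Authorization header is added.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url)
    if username is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical server, collection, and credentials (assumptions).
req = build_solr_request(
    "http://localhost:8983/solr",
    "my_collection/admin/ping",
    username="user",
    password="secret",
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually contact the server. A successful response would suggest that the URL and credentials are usable.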

Search your Solr server

There are two search operators for Solr: Search Solr (Data) and Search Solr (Documents). The Search Solr (Data) operator lets you query Solr servers and obtain the results as a data table. The Search Solr (Documents) operator works similarly but supplies the data as a collection of documents that can be processed further with the Text extension. We will demonstrate the configuration for the Search Solr (Data) operator; the same steps also apply to Search Solr (Documents).

  1. Open a new process in RapidMiner Studio, drag the Search Solr (Data) operator into the Process view, and connect its output port to the result port of the process. Select your Solr connection for the connection entry parameter from the connections folder of the repository you stored it in by clicking on the repository chooser button next to it.

    Alternatively, you can drag the Solr connection from the repository into the Process Panel and connect the resulting operator with the Search Solr (Data) operator.

  2. Select a collection from the list of the collection parameter.

  3. Define the search query by clicking on the button next to the query parameter. You can add filters to refine your query. If the filter query parameter is not visible, click on Show advanced parameters to display it.

  4. Optionally, you can specify advanced parameters like data facets for a faceted search. Note that you can change the default limit of 100 for the maximum number of results.

  5. Run the process! In the Result Perspective, you can see the table resulting from your query. The Solr collection fields are now the columns and every row comes from a Solr entry.
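
The parameters in the steps above correspond roughly to parameters of Solr's standard select handler. As a hedged sketch, and not a description of what the operator does internally, the following Python snippet builds such a query URL with a query (`q`), filter queries (`fq`), a result limit (`rows`), and facet fields. The helper `build_select_url`, the server URL, the collection name, and the field names are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_select_url(base_url, collection, query,
                     filter_queries=(), limit=100, facet_fields=()):
    """Build a URL for Solr's select handler without sending it."""
    params = [("q", query), ("rows", str(limit)), ("wt", "json")]
    # Each filter query becomes its own 'fq' parameter.
    params += [("fq", fq) for fq in filter_queries]
    if facet_fields:
        # Faceting is switched on once, then one 'facet.field' per field.
        params.append(("facet", "true"))
        params += [("facet.field", field) for field in facet_fields]
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical server, collection, and field names (assumptions).
url = build_select_url(
    "http://localhost:8983/solr",
    "my_collection",
    "title:solr",
    filter_queries=["year:[2020 TO *]"],
    facet_fields=["category"],
)
# Decode the query string again to inspect the individual parameters.
params = parse_qs(urlsplit(url).query)
print(params)
```

The `limit` default of 100 mirrors the default mentioned in step 4. Raising it returns more results per query.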

Follow the same steps to use the Search Solr (Documents) operator. After specifying the collection and the query, you can select the document body field. This parameter specifies the Solr field that will be stored in the RapidMiner document body. The other Solr fields become metadata records of the document.

Every Solr entry is now transformed into a Document, rather than into a row as with the Search Solr (Data) operator.
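
To illustrate the role of the document body field, here is a minimal Python sketch of the mapping described above: one Solr field becomes the body and the remaining fields become metadata. The function `entry_to_document`, the dictionary layout, and the sample field names are assumptions for illustration, not the operator's actual data structures.

```python
def entry_to_document(solr_entry, body_field):
    """Split one Solr entry into a document body and metadata records."""
    # The chosen field supplies the body; a missing field yields an empty body.
    body = solr_entry.get(body_field, "")
    # Every other field is kept as a key/value metadata record.
    metadata = {key: value for key, value in solr_entry.items()
                if key != body_field}
    return {"body": body, "metadata": metadata}


# Hypothetical Solr entry (assumption).
entry = {"id": "42", "author": "Jane Doe", "text": "Solr is a search server."}
document = entry_to_document(entry, body_field="text")
print(document["body"])
print(document["metadata"])
```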

Add to your Solr server

As with searching, there are two operators to add to Solr. The Add to Solr (Data) operator uploads the content of a data table to the Solr server. The Add to Solr (Documents) operator works similarly but expects the input as a collection of documents that come from the Text extension.

We will demonstrate the configuration for the Add to Solr (Data) operator; the same steps also apply to Add to Solr (Documents).

  1. Open a new process in RapidMiner Studio, drag the Add to Solr (Data) operator into the Process view, and specify a connection as described above.

  2. Select a collection from the list of the collection parameter.

  3. Connect the input port of the operator with the data table that should be added. Every column will become a Solr field and every row a Solr entry for the respective fields.
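
The column-to-field and row-to-entry mapping from step 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `table_to_solr_docs` and the sample columns are made up, and the resulting JSON array is the kind of payload that Solr's JSON update handler accepts, which is not necessarily how the operator transmits the data.

```python
import json


def table_to_solr_docs(columns, rows):
    """Turn a table into one dict per row, keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


# Hypothetical data table (assumption).
columns = ["id", "title", "year"]
rows = [
    ["1", "Intro to Solr", 2020],
    ["2", "Faceted search", 2021],
]

docs = table_to_solr_docs(columns, rows)
# Serialize all entries as a single JSON array.
payload = json.dumps(docs)
print(payload)
```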

The Add to Solr (Documents) operator works exactly the same way, just with a collection of Documents as input. The metadata records of the Documents consist of keys and their related values. The keys become Solr fields, and each Document defines one Solr entry with the related values. As Documents have an additional body, you can specify the Solr field for it via the document body field parameter.
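
As a rough sketch of this mapping, the Python snippet below combines a document's metadata records and its body into a single Solr entry. The function `document_to_entry`, the dictionary layout of the document, and the field names are assumptions for illustration only.

```python
def document_to_entry(document, body_field):
    """Combine metadata records and the body into one Solr entry."""
    # Metadata keys become Solr fields with their related values.
    entry = dict(document["metadata"])
    # The body is stored under the chosen document body field.
    entry[body_field] = document["body"]
    return entry


# Hypothetical document (assumption).
document_in = {
    "body": "Solr is a search server.",
    "metadata": {"id": "42", "author": "Jane Doe"},
}
solr_entry = document_to_entry(document_in, body_field="text")
print(solr_entry)
```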