Friday, 22 July 2016

How To Install Solr 5.2.1 on Ubuntu 14.04

Hi everybody,

Today we will discuss how to install Apache Solr 5.2.1 on Ubuntu 14.04.

Solr is a search engine platform based on Apache Lucene. It is written in Java and uses the Lucene library to implement indexing. It can be accessed over HTTP through REST-like APIs that exchange XML or JSON. This is the feature list from its website:

    Advanced Full-Text Search Capabilities

    Optimized for High Volume Web Traffic

    Standards Based Open Interfaces - XML, JSON and HTTP

    Comprehensive HTML Administration Interfaces

    Server statistics exposed over JMX for monitoring

    Linearly scalable, auto index replication, auto failover and recovery

    Near Real-time indexing

    Flexible and Adaptable with XML configuration

    Extensible Plugin Architecture


In this article, we will install Solr using its binary distribution.

Prerequisites


To follow this tutorial, you will need:

    One Ubuntu 14.04 Droplet with at least 1 GB of RAM; the amount you actually need depends heavily on how much data you index and how many queries you serve.


    A non-root user with sudo privileges.


Step 1 — Installing Java


Solr requires Java, so in this step, we will install it.

The complete Java installation process is thoroughly described in this article, but we'll use a slightly different process.

First, use apt-get to install python-software-properties:

    sudo apt-get install python-software-properties

Instead of using the default-jdk or default-jre packages, we'll install the latest version of Java 8. To do this, add the unofficial Java installer repository:

    sudo add-apt-repository ppa:webupd8team/java

You will need to press ENTER to accept adding the repository to your index.

Then, update the source list:

    sudo apt-get update


Last, install Java 8 using apt-get. You will need to agree to the Oracle Binary Code License Agreement for the Java SE Platform Products and JavaFX.

    sudo apt-get install oracle-java8-installer
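
Once the installer finishes, it is worth confirming which Java the shell actually picks up. The helper below is only a sketch: the function name java_major_version is made up for this example. It extracts the major version from the first line printed by java -version. Solr 5.x needs Java 7 or later, so with the installation above we would expect it to print 8.

```shell
# Print the major Java version from the first line of `java -version`.
java_major_version() {
  # $1 looks like: java version "1.8.0_91"
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;   # legacy scheme: 1.8.0_91 -> 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;   # newer scheme: 11.0.2 -> 11
  esac
}

# Usage (java -version writes to stderr, hence 2>&1):
#   java_major_version "$(java -version 2>&1 | head -n 1)"
```

If the usage line prints nothing, the java binary is probably not on the PATH yet; opening a new shell session usually fixes that.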

Step 2 — Installing Solr


In this section, we will install Solr 5.2.1. We will begin by downloading the Solr distribution.

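First, find a suitable mirror on this page. Then, copy the link of solr-5.2.1.tgz from the mirror. For example, we'll use http://apache.mirror1.spango.com/lucene/solr/5.2.1/. Mirrors usually carry only current releases; if 5.2.1 is no longer listed, it remains available from the Apache archive at https://archive.apache.org/dist/lucene/solr/5.2.1/.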

Then, download the file in your home directory:

    cd ~
    wget http://apache.mirror1.spango.com/lucene/solr/5.2.1/solr-5.2.1.tgz

Next, extract the service installation file:

    tar xzf solr-5.2.1.tgz solr-5.2.1/bin/install_solr_service.sh --strip-components=2

And install Solr as a service using the script:

    sudo bash ./install_solr_service.sh solr-5.2.1.tgz

Finally, check if the server is running:

    sudo service solr status


You should see an output that begins with this:

Solr status output


Found 1 Solr nodes:

Solr process 2750 running on port 8983

. . .
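
If you want to script this check, the port can be pulled out of the status output. This is a rough sketch that assumes the status line has the form shown above; the function name solr_ports is an invention for this example.

```shell
# Read `service solr status` output on stdin and print each port
# that a Solr process reports it is running on.
solr_ports() {
  sed -n 's/^[[:space:]]*Solr process [0-9][0-9]* running on port \([0-9][0-9]*\).*/\1/p'
}

# Usage:
#   sudo service solr status | solr_ports
#   curl -s "http://localhost:8983/solr/admin/info/system?wt=json"
```

The second usage line asks Solr for its system information over HTTP, which confirms that the server also answers requests and is not merely running as a process.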

Step 3 — Creating a Collection


In this section, we will create a simple Solr collection.

Solr can have multiple collections, but for this example, we will only use one. To create a new collection, use the following command. We run it as the Solr user in this case to avoid any permissions errors.

    sudo su - solr -c "/opt/solr/bin/solr create -c gettingstarted -n data_driven_schema_configs"

In this command, gettingstarted is the name of the collection and -n specifies the configset. There are 3 config sets supplied by Solr by default; in this case, we have used one that is schemaless, which means that any field can be supplied, with any name, and the type will be guessed.

You have now added the collection and can start adding data. The default schema has only one required field: id. It has no other default fields, only dynamic fields. If you want to have a look at the schema, where everything is explained clearly, have a look at the file /opt/solr/server/solr/gettingstarted/conf/schema.xml.

Step 4 — Adding and Querying Documents


In this section, we will explore the Solr web interface and add some documents to our collection.

When you visit http://your_server_ip:8983/solr using your web browser, the Solr web interface should appear:

Solr Web Interface


The web interface contains a lot of useful information which can be used to debug any problems you encounter during use.

Collections are divided up into cores, which is why there are a lot of references to cores in the web interface. Right now, the collection gettingstarted only contains one core, named gettingstarted. At the left-hand side, the Core Selector pull down menu is visible, in which you'll be able to select gettingstarted to view more information.

After you've selected the gettingstarted core, select Documents. Documents store the real data that will be searchable by Solr. Because we have used a schemaless configuration, we can use any field. Let's add a single document with the following example JSON representation by copying the below into the Document(s) field:

{
    "number": 1,
    "president": "George Washington",
    "birth_year": 1732,
    "death_year": 1799,
    "took_office": "1789-04-30",
    "left_office": "1797-03-04",
    "party": "No Party"
}


Click Submit document to add the document to the index. After a few moments, you will see the following:

Output after adding Document

Status: success
Response:
{
  "responseHeader": {
    "status": 0,
    "QTime": 509
  }
}

You can add more documents, with a similar or a completely different structure, but you can also continue with just one document.
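
Documents can also be sent from the command line instead of the web interface. The following is a minimal sketch using curl against Solr's JSON update handler. The function names and the file name in the usage note are assumptions made for this example, and the base URL assumes Solr is listening on localhost:8983 as installed above.

```shell
# Build the update URL for a collection; commit=true makes the change
# visible to searches as soon as the request returns.
solr_update_url() {
  # $1 = base URL (e.g. http://localhost:8983/solr), $2 = collection name
  printf '%s/%s/update?commit=true\n' "${1%/}" "$2"
}

# Post a JSON array of documents read from a file.
solr_add_docs() {
  # $1 = base URL, $2 = collection name, $3 = path to a JSON file
  curl -sS -H 'Content-Type: application/json' \
       --data-binary "@$3" "$(solr_update_url "$1" "$2")"
}
```

To use it, save one or more documents as a JSON array (the document above wrapped in [ and ]) in a file such as presidents.json, then run solr_add_docs http://localhost:8983/solr gettingstarted presidents.json. A responseHeader with status 0 indicates success, as in the web interface.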

Now, select Query on the left to query the document we just added. With the default values in this screen, after clicking on Execute Query, you will see 10 documents at most, depending on how many you added:

Query output

{
  "responseHeader": {
    "status": 0,
    "QTime": 58,
    "params": {
      "q": "*:*",
      "indent": "true",
      "wt": "json",
      "_": "1436827539345"
    }
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {
        "number": [
          1
        ],
        "president": [
          "George Washington"
        ],
        "birth_year": [
          1732
        ],
        "death_year": [
          1799
        ],
        "took_office": [
          "1789-04-30T00:00:00Z"
        ],
        "left_office": [
          "1797-03-04T00:00:00Z"
        ],
        "party": [
          "No Party"
        ],
        "id": "1ce12ed2-add9-4c65-aeb4-a3c6efb1c5d1",
        "_version_": 1506622425947701200
      }
    ]
  }
}
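
The same match-all query can be run with curl, which is handy for quick checks from a terminal. This is a sketch under the same assumptions as before (Solr on localhost:8983, a collection named gettingstarted); solr_select_all and num_found are hypothetical helper names, and num_found uses a simple sed pattern rather than a real JSON parser.

```shell
# Run a match-all query against a collection and print the raw JSON.
solr_select_all() {
  # $1 = base URL (e.g. http://localhost:8983/solr), $2 = collection name
  curl -sS -G "${1%/}/$2/select" \
       --data-urlencode 'q=*:*' \
       --data-urlencode 'wt=json' \
       --data-urlencode 'indent=true'
}

# Read a Solr JSON response on stdin and print its numFound value.
num_found() {
  sed -n 's/.*"numFound"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage:
#   solr_select_all http://localhost:8983/solr gettingstarted | num_found
```

With only the George Washington document indexed, the usage line should print 1, matching the numFound value in the output above.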

Conclusion


There are many more options available, but you have now successfully installed Solr and can start using it for your own site.

Thanks.
 

Saturday, 16 July 2016

Using Solarium with SOLR and Laravel 5, for Search – Solarium and GUI

Hi Everybody,

It has been a long time since my last post because I have been busy elsewhere, but I am back with something that I hope will be really helpful.

In the first part, I introduced the key concepts and we installed and set up SOLR. In this second part we’ll install Solarium, start building an example application, populate the search index and get into a position where we can start running searches.

Creating the Application


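Create a new Laravel application via Composer. The paths and Artisan commands used in this post (app/config, app/controllers, app/commands, php artisan command:make) follow the Laravel 4 layout, so the version is pinned to 4.2 here:

composer create-project laravel/laravel movie-search 4.2 --prefer-dist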

Make the app/storage directory writeable.

Installing Solarium


You’ll recall from the first part that the Solarium Project provides a level of abstraction over the underlying SOLR requests which enables you to code as if SOLR were a native search implementation running locally, rather than worry about issuing HTTP requests or parsing the responses.

By far the best way to install Solarium is using Composer:

"solarium/solarium": "dev-develop"

Alternatively you can download or clone it from Github.

If you’re following along and building the movie search application, add this line to the require section of your newly created project’s composer.json file, just as you would for any other third-party package, and run composer update. There’s no Laravel service provider to register in this case.

Configuring Solarium

The constructor to the Solarium client takes an array of connection details, so create a configuration file – app/config/solr.php as follows:

<?php

return array(
    'host'      => '127.0.0.1',
    'port'      => 8983,
    'path'      => '/solr/',
);

If you’ve installed SOLR as per the instructions in the first part these defaults should be just fine, although in some circumstances you may need to change the port number.

To keep things simple, we’re going to create an instance of the Solarium client as a property of the controller (app/controllers/HomeController.php):

    /**
     * @var \Solarium\Client The SOLR client.
     */
    protected $client;

    /**
     * Constructor
     */
    public function __construct()
    {
        // create a client instance
        $this->client = new \Solarium\Client(Config::get('solr'));
    }

Normally in Laravel you’d create an instance in a service provider bound to the IoC container, but this way will do fine for what’s a pretty simple application.

Ping Queries

A ping query is useful for checking that the SOLR server is running and accessible, and therefore a good place to begin. Using Solarium it’s simple, so you may wish to test if everything is working by using the following:

// create a ping query
$ping = $client->createPing();

// execute the ping query
try {
    $result = $client->ping($ping);
} catch (\Solarium\Exception\HttpException $e) {
    // the SOLR server is inaccessible, do something
}

As the example illustrates, an inaccessible SOLR instance throws an exception, so you can act accordingly by catching it at this point.
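
When the PHP side reports a failure, it helps to rule out the application by pinging SOLR directly from a terminal. The sketch below assumes the core has the standard /admin/ping handler enabled and uses gettingstarted as a placeholder core name; the function names are made up for this example.

```shell
# Build the ping URL for a core.
solr_ping_url() {
  # $1 = base URL (e.g. http://127.0.0.1:8983/solr), $2 = core name
  printf '%s/%s/admin/ping?wt=json\n' "${1%/}" "$2"
}

# Succeed only if the JSON response on stdin reports "status":"OK".
ping_ok() {
  grep -q '"status"[[:space:]]*:[[:space:]]*"OK"'
}

# Usage:
#   curl -s "$(solr_ping_url http://127.0.0.1:8983/solr gettingstarted)" | ping_ok \
#     && echo "Solr is up" || echo "Solr is unreachable"
```

If this reports that Solr is up while the Solarium ping still throws, the problem is more likely in the host, port or path values in app/config/solr.php than in the server itself.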

Sample Data

For the purposes of this tutorial we’re going to build a simple movie search. I’ve created a CSV file containing around 2,000 movies matched against a handful of keywords (for example space, night and house), which you can download; if you want to create your own, you might want to check out the Rotten Tomatoes API. (As a related aside, IMDB makes its data available, but it is spread over a number of CSV files, some of which are enormous, and they are only available via the website or over FTP.)

Before we write a command to import this data, let’s look at the basic create, update and delete operations on the SOLR search implementation using Solarium.

Adding Documents to the Search Index

To add a document to the search index, you first need to create an update query instance:

$update = $client->createUpdate();

Then create a document:

$doc = $update->createDocument();

Now you can treat the document ($doc) as if it were a stdClass and simply assign data as public properties:

$doc->id = 123;
$doc->title = 'Some Movie';
$doc->cast = array('Sylvester Stallone', 'Marilyn Monroe', 'Macaulay Culkin');

…and so on, before adding the document to the update query:

$update->addDocument($doc);

Then add a commit command to the update query:

$update->addCommit();

Finally, to actually run the query you call update() on the client:

$result = $client->update($update);

If you wish to verify that you’ve successfully indexed some documents, visit the SOLR admin panel in your browser (typically http://localhost:8983/solr) and click Core Admin. You’ll find the number of documents in the index listed as numDocs in the Index section.

Updating Documents

If you need to update a document in the index, you simply need to “re-add” it and – assuming it has the same unique identifier – SOLR will be smart enough to update it, rather than add a new one.

Deleting Documents

You can delete a document from the index using an update query, using syntax not too dissimilar to WHERE clauses in SQL. For example, to delete a document uniquely identified by the ID 123:

// get an update query instance
$update = $client->createUpdate();

// add the delete query and a commit command to the update query
$update->addDeleteQuery('id:123');
$update->addCommit();

// this executes the query and returns the result
$result = $client->update($update);

Or you can be less specific:

// get an update query instance
$update = $client->createUpdate();

// add the delete query and a commit command to the update query
$update->addDeleteQuery('title:test*');
$update->addCommit();

// this executes the query and returns the result
$result = $client->update($update);

Note the use of a wildcard character – in other words: “delete all documents whose title starts with test”.
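
Under the hood, a delete query like this becomes a small JSON (or XML) command posted to SOLR's update handler. The sketch below shows the JSON form sent with curl; it is an illustration of the HTTP request, not Solarium's own implementation, and the function names and core name are assumptions.

```shell
# Wrap a query in Solr's JSON delete-by-query command.
delete_payload() {
  # Escape backslashes and double quotes so the query is a valid JSON string.
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"delete":{"query":"%s"}}\n' "$escaped"
}

# Send the delete command and commit in one request.
solr_delete_by_query() {
  # $1 = base URL, $2 = core name, $3 = query (e.g. 'title:test*')
  curl -sS -H 'Content-Type: application/json' \
       --data-binary "$(delete_payload "$3")" \
       "${1%/}/$2/update?commit=true"
}

# Usage:
#   solr_delete_by_query http://localhost:8983/solr gettingstarted 'title:test*'
```

Because a wildcard delete is easy to get wrong, it is worth running the same query as a search first and checking how many documents it matches.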

Populating the Search Index with Movies

Now we’ve looked at the fundamentals of indexing documents, let’s put some data into the index for our sample application.

Laravel makes it easy to build console commands, so let’s create one to import the contents of our movie CSV file and index them. We could also create corresponding database records at this point, but for the purposes of this exercise the indexed documents will contain everything we need.

To create the command, enter the following into the command line:

php artisan command:make PopulateSearchIndexCommand

In the newly-generated file – app/commands/PopulateSearchIndexCommand.php – set the command’s name (i.e., what you enter on the command-line to run it) and the description:

/**
 * The console command name.
 *
 * @var string
 */
protected $name = 'search:populate';

/**
 * The console command description.
 *
 * @var string
 */
protected $description = 'Populates the search index with some sample movie data.';

Now we’ll use the fire() method to create the Solarium client, open the CSV, iterate through it and index each movie:

/**
 * Execute the console command.
 *
 * @return void
 */
public function fire()
{       
    $client = new \Solarium\Client(Config::get('solr'));

    // open up the CSV
    $csv_filepath = storage_path() . '/movies.csv';

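    $fp = fopen($csv_filepath, 'r');

    // stop early if the file is missing or unreadable
    if ($fp === false) {
        $this->error('Could not open ' . $csv_filepath);
        return;
    }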

    // Now let's start importing
    while (($row = fgetcsv($fp, 1000, ";")) !== FALSE) {

        // get an update query instance
        $update = $client->createUpdate();

        // Create a document
        $doc = $update->createDocument();

        // set the ID
        $doc->id = $row[0];

        // ..and the title
        $doc->title = $row[1];

        // The year, rating and runtime columns don't always have data
        if (strlen($row[2])) {
            $doc->year = $row[2];
        }
        if (strlen($row[3])) {
            $doc->rating = $row[3];
        }
        if (strlen($row[4])) {
            $doc->runtime = $row[4];
        }

        // set the synopsis
        $doc->synopsis = $row[5];

        // We need to create an array of cast members
        $cast = array();

        // Columns 6 through 10 contain (or don't contain) cast members' names
        for ($i = 6; $i <= 10; $i++) {
            if ((isset($row[$i])) && (strlen($row[$i]))) {
                $cast[] = $row[$i];
            }
        }

        // ...then we can assign the cast member array to the document
        $doc->cast = $cast;

        // Let's simply add and commit straight away.
        $update->addDocument($doc);
        $update->addCommit();

        // this executes the query and returns the result
        $result = $client->update($update);

    }

    fclose($fp);
}

Whilst this isn’t the slickest or most resilient piece of code, it’s a fairly academic exercise which we’re only planning to run once anyway. Ensure the CSV file is in the correct place (app/storage/movies.csv) and run it:

php artisan search:populate

All being well, the index should now contain ~2,300 movies. You can check this via the admin interface.
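
If the count looks wrong, the CSV itself is a good first suspect, since fire() reads columns 0 to 5 of every row without checking that they exist. The following sketch counts lines with too few fields. It is deliberately naive: it splits on every semicolon and, unlike fgetcsv, does not understand quoted fields. The function name short_rows is made up for this example.

```shell
# Print the number of lines in a semicolon-separated file that have
# fewer than the expected number of fields.
short_rows() {
  # $1 = path to the CSV file, $2 = minimum field count (defaults to 6)
  awk -F';' -v min="${2:-6}" 'NF < min { bad++ } END { print bad + 0 }' "$1"
}

# Usage:
#   short_rows app/storage/movies.csv
```

A result of 0 means every line has at least an ID, title, year, rating, runtime and synopsis column, even if some of those columns are empty.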

In the next section we’ll start building the basic search.

The Search Form

Let’s create the search form using Laravel’s Blade templating engine, along with the form builder. So, in app/views/home/index.blade.php:

@extends('layouts.default')

@section('content')

<header>
    {{ Form::open(array('method' => 'GET')) }}
    <div class="input-group">
        {{ Form::text('q', Input::get('q'), array('class' => 'form-control input-lg', 'placeholder' => 'Enter your search term')) }}        
        <span class="input-group-btn">
            {{ Form::submit('Search', array('class' => 'btn btn-primary btn-lg')) }}            
        </span>
    </div>
    {{ Form::close() }}
</header>

@endsection

A basic page layout in app/views/layouts/default.blade.php:

<!doctype html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Movie Search</title>

    <link rel="stylesheet" href="//netdna.bootstrapcdn.com/bootstrap/3.0.3/css/bootstrap.min.css">

</head>
<body>
    <div class="container">
        <header id="header" class="row">        
            <div class="col-sm-12 col-md-12 col-lg-12">
                <h1>Movie Search</h1>
            </div>      
        </header>

        <div class="row">
            <div class="col-sm-12 col-md-12 col-lg-12">
                @yield('content')
            </div>
        </div>

        <hr />

        <footer id="footer">

        </footer>

    </div>

</body>
</html>

In app/controllers/HomeController.php:

public function getIndex()
{
    return View::make('home.index');
}

Finally, replace the contents of app/routes.php with the following, to tell Laravel to use HomeController:

<?php
/*
|--------------------------------------------------------------------------
| Application Routes
|--------------------------------------------------------------------------
*/
Route::controller('/', 'HomeController');

Now we’re set up and ready to implement the basic search mechanism, which will be the subject of the next part in the series.

Thanks.