SolrCloud cluster

1 Introduction to SolrCloud

1.1 What is SolrCloud

  SolrCloud (Solr Cloud) is the distributed search solution provided by Solr. It is used when you need large-scale, fault-tolerant, distributed indexing and retrieval. A system with a small amount of index data does not need SolrCloud. When the index is large and search requests are highly concurrent, SolrCloud is the way to meet those requirements.

  SolrCloud is a distributed search solution based on Solr and Zookeeper. Its main idea is to use Zookeeper as the configuration information center of the cluster.

  It has several features:

    1) Centralized configuration information

    2) Automatic fault tolerance

    3) Near real-time Search

    4) Automatic load balancing during query

1.2 SolrCloud system architecture


  [1] Physical structure

     Three Solr instances (each containing two cores) form a SolrCloud.

  [2] Logical structure

     The collection (index set) includes two shards, shard1 and shard2. Each shard is made up of three cores: one leader and two replicas. The leader is elected through ZooKeeper, and the index data of the three cores in a shard is kept consistent, which provides high availability.

     A search request from a user is served by shard1 and shard2 separately, which spreads the load and addresses the high concurrency problem.

  (1)Collection

     A Collection is a logically complete index in the SolrCloud cluster. It is usually divided into one or more shards, all of which use the same configuration.

     For example, a collection can be created for product information search.

    collection=shard1+shard2+….+shardX

  (2) Core

     Each Core is an independent operating unit in Solr that provides indexing and search services. A shard is made up of one or more cores. Since a collection is made up of multiple shards, a collection generally consists of multiple cores.

  (3) Master or Slave

     The Master is the master node in a master-slave structure (usually called the master server). The Slave is the slave node (usually called the slave server or standby server). The master and slave under the same shard store the same data, which is how high availability is achieved.

  (4) Shard

     A Shard is a logical partition of a Collection. Each shard is made up of one or more replicas, and an election decides which of them is the leader.
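The relationship between these terms can be sketched as a small data model. This is only an illustration in Python: the class names (Replica, Shard, Collection), the node labels (solr-1 and so on) and the core names are hypothetical and are not part of Solr's API. It mirrors the architecture described above, with two shards of three cores each and one leader per shard. In a real cluster the leader is chosen by election; here the first core is simply marked as leader.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    core_name: str           # a Solr core holding one copy of the shard
    node: str                # hypothetical label for the Solr instance
    is_leader: bool = False


@dataclass
class Shard:
    name: str
    replicas: List[Replica] = field(default_factory=list)

    def leader(self) -> Replica:
        # Exactly one replica per shard is expected to be the leader.
        leaders = [r for r in self.replicas if r.is_leader]
        if len(leaders) != 1:
            raise ValueError(f"{self.name} must have exactly one leader")
        return leaders[0]


@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

    def core_count(self) -> int:
        # collection = shard1 + shard2 + ... + shardX
        return sum(len(s.replicas) for s in self.shards)


def build_example() -> Collection:
    # Three Solr instances, each holding one core of shard1 and one of shard2.
    nodes = ["solr-1", "solr-2", "solr-3"]
    shards = []
    for shard_name in ("shard1", "shard2"):
        replicas = [
            Replica(f"{shard_name}_core{i + 1}", node, is_leader=(i == 0))
            for i, node in enumerate(nodes)
        ]
        shards.append(Shard(shard_name, replicas))
    return Collection("collection1", shards)


collection = build_example()
print(collection.core_count())
print([s.leader().core_name for s in collection.shards])
```

Each instance ends up with two cores (one per shard), which matches the physical structure of three instances with two cores each.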

2 Set up SolrCloud

2.1 Set up requirements


  ZooKeeper serves as the cluster management tool:

    1. Cluster management: fault tolerance and load balancing

    2. Centralized management of configuration files

    3. Cluster entry point

     ZooKeeper itself must be highly available, so a ZooKeeper cluster is needed. An odd number of nodes is recommended; this setup uses three ZooKeeper servers.

     Building the Solr cluster takes 7 servers (for a pseudo-distributed setup on a virtual machine, more than 1 GB of memory is recommended):

     three ZooKeeper nodes

     four Tomcat nodes
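The recommendation for an odd number of ZooKeeper nodes follows from majority arithmetic: the ensemble stays available only while more than half of its servers are up. The short Python sketch below works this out; the function names are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


# Columns: ensemble size, quorum size, tolerated failures
for n in range(1, 6):
    print(n, quorum_size(n), tolerated_failures(n))
```

Three servers tolerate one failure, and so do four, so the fourth server adds cost without adding fault tolerance.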

2.2 Preparations

  Environment preparation

    CentOS-6.5-i386-bin-DVD1.iso

    jdk-7u72-linux-i586.tar.gz

    apache-tomcat-7.0.47.tar.gz

    zookeeper-3.4.6.tar.gz

    solr-4.10.3.tgz

  Steps:

     (1) Set up a Zookeeper cluster (we have completed it in the previous section)


     (2) Upload the tomcat that has been deployed with Solr to Linux

     (3) Create the folder /usr/local/solr-cloud on Linux and create 4 Tomcat instances in it


     mkdir /usr/local/solr-cloud
     cp -r tomcat-solr /usr/local/solr-cloud/tomcat-1
     cp -r tomcat-solr /usr/local/solr-cloud/tomcat-2
     cp -r tomcat-solr /usr/local/solr-cloud/tomcat-3
     cp -r tomcat-solr /usr/local/solr-cloud/tomcat-4

     (4) Upload the local solrhome to Linux

     (5) Create the folder /usr/local/solrhomes on Linux and make 4 copies of solrhome in it

     mkdir /usr/local/solrhomes
     cp -r solrhome /usr/local/solrhomes/solrhome-1
     cp -r solrhome /usr/local/solrhomes/solrhome-2
     cp -r solrhome /usr/local/solrhomes/solrhome-3
     cp -r solrhome /usr/local/solrhomes/solrhome-4

     (6) Modify the web.xml file of each Solr web application so that it points to its own solrhome (for example /usr/local/solr-cloud/tomcat-1/webapps/solr/WEB-INF/web.xml)

     <env-entry>
         <env-entry-name>solr/home</env-entry-name>
         <env-entry-value>/usr/local/solrhomes/solrhome-1</env-entry-value>
         <env-entry-type>java.lang.String</env-entry-type>
     </env-entry>

  (7) Change the default ports 8005, 8080 and 8009 of each Tomcat instance to the following, respectively:

    8105 8180 8109

    8205 8280 8209

    8305 8380 8309

    8405 8480 8409

    8005 is the shutdown port. Tomcat listens on it for the command that stops the service.

    8080 is the HTTP connector port. This connector is used when a browser accesses a web application on the Tomcat server.

    8009 is the AJP connector port, used for connections from other HTTP servers. This connector is needed when Tomcat is integrated with another HTTP server.
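The port table above follows a simple rule: each instance adds 100 times its instance number to the three default ports. A minimal Python sketch of that rule, with hypothetical names (BASE_PORTS, instance_ports), can be used to double-check the values typed into each server.xml:

```python
# Tomcat's default ports, as listed above.
BASE_PORTS = {"shutdown": 8005, "http": 8080, "ajp": 8009}


def instance_ports(instance_number: int) -> dict:
    """Ports for tomcat-N under the 'default + 100 * N' scheme."""
    offset = 100 * instance_number
    return {name: port + offset for name, port in BASE_PORTS.items()}


for i in range(1, 5):
    print(f"tomcat-{i}", instance_ports(i))
```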

2.3 Configure the cluster

  (1) Modify the catalina.sh file in the bin directory of each tomcat instance

       Add this configuration to catalina.sh:

      JAVA_OPTS="-DzkHost=192.168.25.101:2181,192.168.25.101:2182,192.168.25.101:2183"

      As the name suggests, JAVA_OPTS is the variable used to set JVM parameters. This setting tells Tomcat where to find the ZooKeeper cluster when it starts.

   (2) Configure the SolrCloud settings. Each solrhome contains a solr.xml; set the host and port in it to the IP and HTTP port of the corresponding Tomcat.

     solrhomes/solrhome-1/solr.xml

     <solrcloud>
         <str name="host">192.168.25.101</str>
         <int name="hostPort">8180</int>
         <str name="hostContext">${hostContext:solr}</str>
         <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
         <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
     </solrcloud>

    solrhomes/solrhome-2/solr.xml

     <solrcloud>
         <str name="host">192.168.25.101</str>
         <int name="hostPort">8280</int>
         <str name="hostContext">${hostContext:solr}</str>
         <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
         <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
     </solrcloud>

    solrhomes/solrhome-3/solr.xml

     <solrcloud>
         <str name="host">192.168.25.140</str>
         <int name="hostPort">8380</int>
         <str name="hostContext">${hostContext:solr}</str>
         <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
         <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
     </solrcloud>

    solrhomes/solrhome-4/solr.xml

     <solrcloud>
         <str name="host">192.168.25.140</str>
         <int name="hostPort">8480</int>
         <str name="hostContext">${hostContext:solr}</str>
         <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
         <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
     </solrcloud>
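Because the hostPort in each solr.xml has to match the HTTP port of the Tomcat that serves it, a quick check can catch copy-and-paste mistakes. The Python sketch below parses a trimmed-down solr.xml held in a string and reads the two values; the function name read_host_and_port and the inline sample are assumptions for illustration, and a real check would read the file from each solrhome instead.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample of solrhome-1/solr.xml (only the fields used here).
SOLR_XML = """<solr>
  <solrcloud>
    <str name="host">192.168.25.101</str>
    <int name="hostPort">8180</int>
    <str name="hostContext">${hostContext:solr}</str>
  </solrcloud>
</solr>"""


def read_host_and_port(xml_text: str):
    """Return (host, hostPort) from the <solrcloud> element."""
    root = ET.fromstring(xml_text)
    cloud = root.find("solrcloud")
    values = {child.get("name"): child.text for child in cloud}
    return values["host"], int(values["hostPort"])


host, port = read_host_and_port(SOLR_XML)
expected_http_port = 8180  # HTTP port configured for tomcat-1
print(host, port, port == expected_http_port)
```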

   (3) Let zookeeper manage configuration files in a unified manner.

      This uses the zkcli.sh script that ships in the Solr package.

       1. Upload solr-4.10.3.tgz

       2. Extract solr-4.10.3.tgz into /usr/local

       3. Find the /usr/local/solr-4.10.3/example/scripts/cloud-scripts directory

       4. Enter the command:

./zkcli.sh -zkhost 192.168.25.101:2181,192.168.25.101:2182,192.168.25.101:2183 -cmd upconfig -confdir /usr/local/solrhomes/solrhome-1/collection1/conf -confname myconf

      Parameter explanation

        -zkhost: the ZooKeeper address list

        -cmd: the command to run; upconfig is the command that uploads the configuration

        -confdir: the directory where the configuration files are located

        -confname: the configuration name
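When the upload is scripted, it helps to assemble the command from its parts so that the ZooKeeper address list is written only once. The following Python sketch builds the same argument list as the command above and prints it; it does not run zkcli.sh, and the helper name upconfig_command is hypothetical.

```python
import shlex


def upconfig_command(zk_hosts, conf_dir, conf_name):
    """Argument list for uploading a config directory with zkcli.sh."""
    return [
        "./zkcli.sh",
        "-zkhost", ",".join(zk_hosts),
        "-cmd", "upconfig",
        "-confdir", conf_dir,
        "-confname", conf_name,
    ]


zk_hosts = [f"192.168.25.101:{port}" for port in (2181, 2182, 2183)]
command = upconfig_command(
    zk_hosts,
    "/usr/local/solrhomes/solrhome-1/collection1/conf",
    "myconf",
)
print(shlex.join(command))
```

The resulting list could be passed to subprocess.run from the cloud-scripts directory, which is left out here so the sketch has no side effects.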

2.4 Start the cluster

1) Start each Tomcat instance. Make sure the ZooKeeper cluster is already running.

    Enter the /usr/local/solr-cloud/tomcat-1/bin directory

     Run: ./startup.sh (if it fails because of insufficient permissions, go back to /usr/local and grant permissions on the directory with: chmod -R 777 solr-cloud)

      Start all four Tomcat instances in turn

      (Note: after copying the files, wait a few seconds before starting. Otherwise the copy may not have finished, and startup can fail because of missing JARs or files.)

   Open http://192.168.25.101:8180/solr/#/ in a browser. If a Cloud entry appears in the menu on the left, the cluster is up.

3 Connecting to the cluster from code

   1. Comment out the previous stand-alone configuration and use the new one:

     SolrJ provides a class called CloudSolrServer. It is a subclass of SolrServer and is used to connect to SolrCloud.

     Its constructor parameter is the ZooKeeper address list, and the defaultCollection property (the default collection name) must also be set.

     Now modify the configuration file of the springDataSolrDemo project: comment out the original solr-server and replace it with CloudSolrServer. Pass the ZooKeeper address list as the constructor argument and set the default collection name.

  

  <bean id="solrServer" class="org.apache.solr.client.solrj.impl.CloudSolrServer">
      <constructor-arg value="192.168.25.140:2181,192.168.25.140:2182,192.168.25.140:2183" />
      <property name="defaultCollection" value="collection1"></property>
  </bean>

  2. Use Maven profiles to switch the environment at packaging time

    Modify the project's Spring configuration for Solr as follows:

   

  <solr:solr-server id="solrServer_dev" url="http://127.0.0.1:8080/solr" />
  <bean id="solrServer_pro" class="org.apache.solr.client.solrj.impl.CloudSolrServer">
      <constructor-arg value="192.168.25.101:2181,192.168.25.101:2182,192.168.25.101:2183" />
      <property name="defaultCollection" value="collection1"></property>
  </bean>

  <bean id="solrTemplate" class="org.springframework.data.solr.core.SolrTemplate">
      <constructor-arg ref="solrServer_${env}" />
  </bean>

    Add the following to the pom file:

  <properties>
      <env>dev</env>
  </properties>
  <profiles>
      <profile>
          <id>dev</id>
          <properties>
              <env>dev</env>
          </properties>
      </profile>
      <profile>
          <id>pro</id>
          <properties>
              <env>pro</env>
          </properties>
      </profile>
  </profiles>

   Add the following to the build section of the pom file:

      <resources>
          <resource>
              <directory>src/main/resources</directory>
              <filtering>true</filtering>
          </resource>
      </resources>
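With filtering enabled, Maven replaces ${env} in the resource files with the value from the active profile, so solrServer_${env} becomes solrServer_dev by default and solrServer_pro when the pro profile is selected (for example with mvn package -P pro). The Python sketch below only imitates that substitution to show the effect; filter_resource is a made-up helper, not Maven's implementation, and leaving unknown placeholders untouched is a choice made for this sketch.

```python
import re


def filter_resource(text: str, properties: dict) -> str:
    """Replace ${name} with its property value; leave unknown names as-is."""
    def substitute(match):
        return properties.get(match.group(1), match.group(0))
    return re.sub(r"\$\{([^}]+)\}", substitute, text)


line = '<constructor-arg ref="solrServer_${env}" />'
dev_line = filter_resource(line, {"env": "dev"})   # default profile
pro_line = filter_resource(line, {"env": "pro"})   # pro profile
print(dev_line)
print(pro_line)
```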

4 Sharding configuration

  1 Create a new Collection with sharding.

     Enter the following address in the browser to create a new Collection that meets our requirements:

http://192.168.25.101:8180/solr/admin/collections?action=CREATE&name=collection2&numShards=2&replicationFactor=2


    Parameters:

      name: the name of the collection to create
      numShards: the number of logical shards to create for the collection
      replicationFactor: the number of replicas of each shard

       A response reporting success indicates that the collection was created.
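The same request can be assembled in code, which also makes the sizing explicit: numShards multiplied by replicationFactor gives the number of cores the collection needs, and with 2 shards and 2 copies of each that is 4, one per Tomcat node in this setup. The Python sketch below only builds the URL and does not send it; create_collection_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE request URL."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


num_shards, replication_factor = 2, 2
url = create_collection_url(
    "http://192.168.25.101:8180/solr", "collection2", num_shards, replication_factor
)
total_cores = num_shards * replication_factor
print(url)
print(total_cores)
```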

   2 Delete the unused Collection by requesting the following address:

http://192.168.25.101:8480/solr/admin/collections?action=DELETE&name=collection1
