Include Elastica in your project as svn:externals
Written by Nicolas Ruflin   
Wednesday, 21 December 2011 15:10

As most of you know, Elastica is hosted on GitHub, which means it uses git as its revision control system. I have several projects that include Elastica but use Subversion for version control. Until now, I included Elastica as an svn external by hosting my own Elastica svn repository. But yesterday I discovered that the code on GitHub can also be checked out through svn. I immediately searched Google for more details about this feature and found several entries on the GitHub blog which I had somehow missed.

It is not only possible to check out whole repositories, but also specific subfolders or tags, and you can even commit to the repository (which I didn't test). As my projects only use the Elastica library folder and don't need the tests and additional data, I check out only the lib folder. To check out the Elastica lib folder from version v0.18.6.0, use the following command:

svn co https://github.com/ruflin/Elastica/tags/v0.18.6.0/lib/ .

If you have a lib folder in your project with all your frameworks and libraries and you want to add Elastica as an external source (which is quite useful), you can set the svn:externals property on your library folder to the following.

https://github.com/ruflin/Elastica/tags/v0.18.6.0/lib/Elastica Elastica

If you already have other sources added as externals to your repository (for example ZF), just add this line below your existing lines. The next time you update your working copy, the Elastica folder with all its files will be checked out. To move to a newer version of Elastica, change the version number in the URL in your svn:externals property.

 
Using Elastica with multiple Elasticsearch Nodes
Written by Nicolas Ruflin   
Monday, 21 November 2011 17:30

Elasticsearch was built with the cloud and multiple distributed servers in mind. It is quite easy to start an Elasticsearch cluster simply by starting multiple instances of Elasticsearch on one server or on multiple servers. Every Elasticsearch instance is called a node. To start multiple instances on your local machine, run the following command twice in the Elasticsearch folder (each in its own terminal, since -f keeps the process in the foreground):

./bin/elasticsearch -f
./bin/elasticsearch -f

As you will see, the first node will be started on port 9200, the second instance on port 9201. Elasticsearch automatically discovers the other node and creates a cluster. Elastica can be used to retrieve all node and cluster information. In the following example first the cluster object is retrieved (Elastica_Cluster) from the client and then the cluster state is read out. Then all cluster nodes (Elastica_Node) are retrieved and the name of every node is printed out. Every cluster has at least one node and every node has a specific name.

$client = new Elastica_Client();

// Retrieve an Elastica_Cluster object
$cluster = $client->getCluster();

// Returns the cluster state
$state = $cluster->getState();

// Get all cluster nodes
$nodes = $cluster->getNodes();

foreach ($nodes as $node) {
    echo $node->getName();
}

Client to multiple servers

As Elasticsearch is a distributed search engine that can run on multiple servers, some servers can fail and search still works as expected, because the data is stored redundantly (replicas). The number of shards and replicas can be chosen for every single index during creation. Of course, this can also be set with Elastica through the mapping, as can be seen in the Elastica_Index test. More details on this perhaps in a later blog post.

One of the goals of a distributed search index is availability. If one server goes down, search results should still be served. But if the client only connects to the server that just went down, no results are returned anymore. Because of this, Elastica_Client supports multiple servers, which are accessed in round-robin order. This is the only and also the most basic option at the moment. So for the two nodes started above on ports 9200 and 9201, we pass the following arguments to Elastica_Client to access both servers.

$client = new Elastica_Client(array(
	'servers' => array(
		array('host' => 'localhost', 'port' => 9200),
		array('host' => 'localhost', 'port' => 9201)
	)
));

From now on, every request is sent to one of these servers in round-robin fashion. Instead of localhost, external servers can be used as well. I'm aware that this is still quite a basic implementation. As some of you have probably realized, this is not a safe failover method, as every second request still goes to the server that is down. One idea is to define a response-time threshold for every server; if a server does not answer within it, the query goes to the next server. In addition, it would be useful to store the information about unavailable servers somewhere in order to use it for the next request. That way, only one client has to wait for the unavailable server. Storing this information is an issue, though, since Elastica does not have any storage backend.
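
The idea of remembering unavailable servers could be sketched roughly as follows. This is only an illustration and not part of Elastica: the class RoundRobinServers, its methods and the retryAfter parameter are all hypothetical names, the state lives in memory for a single PHP process only (so it does not solve the storage question), and PHP 7.4 or later is assumed. The response-time threshold is left out; the caller decides when to call markDown().

```php
<?php
// Hypothetical sketch: none of these names exist in Elastica.
final class RoundRobinServers
{
    private array $servers;
    private array $downUntil = []; // server index => timestamp until which it is skipped
    private int $next = 0;         // index the next round-robin pick starts from
    private int $retryAfter;       // seconds a failed server stays out of rotation

    public function __construct(array $servers, int $retryAfter = 30)
    {
        $this->servers = array_values($servers);
        $this->retryAfter = $retryAfter;
    }

    /** Returns the index of the next server not marked as down, or null if all are down. */
    public function pick(int $now): ?int
    {
        $count = count($this->servers);
        for ($i = 0; $i < $count; $i++) {
            $index = ($this->next + $i) % $count;
            if (($this->downUntil[$index] ?? 0) <= $now) {
                $this->next = ($index + 1) % $count;
                return $index;
            }
        }
        return null;
    }

    /** Takes a server out of rotation for retryAfter seconds. */
    public function markDown(int $index, int $now): void
    {
        $this->downUntil[$index] = $now + $this->retryAfter;
    }

    /** Returns the host/port array for a server index. */
    public function server(int $index): array
    {
        return $this->servers[$index];
    }
}

$pool = new RoundRobinServers(array(
    array('host' => 'localhost', 'port' => 9200),
    array('host' => 'localhost', 'port' => 9201),
));

$now = time();
$first = $pool->pick($now);  // server 0
$pool->markDown(1, $now);    // pretend the node on port 9201 just timed out
$second = $pool->pick($now); // server 0 again, because server 1 is skipped
```

With this approach only the request that discovers the failure has to wait for the dead server. Later requests in the same process skip it until the retry interval has passed.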

Load Distribution

This client implementation also makes it possible to distribute the load across multiple nodes. As far as I know, Elasticsearch already does this quite well on its own, but it helps if more than one node can answer HTTP requests. The method above is therefore really useful if you run more than one Elasticsearch node in a cluster and want to send your requests to all servers.

It is planned to enhance this multiple server implementation in the future with additional parameters such as priority for a server and some other ideas. Please feel free to write down your ideas in the comment section or directly create a pull request on github.

 
How to Log Requests in Elastica
Written by Nicolas Ruflin   
Sunday, 20 November 2011 20:50

In Elastica release v0.18.4.1, the capability to log requests was added. There is a general Elastica_Log object that can later be extended to log other things such as responses, exceptions and more. The Elastica_Log constructor takes an Elastica_Client as its parameter. To enable logging, the client's log config variable has to be set to true, or to a specific path the log should be written to. This means that every client instance decides on its own whether logging is enabled.

The example below will log the message "hello world" to the general PHP log.

$client = new Elastica_Client(array('log' => true));
$log = new Elastica_Log($client);
$log->log('hello world');

If a file path is set as the log config param, the "hello world" message is written to that file instead, here /tmp/php.log.

$client = new Elastica_Client(array('log' => '/tmp/php.log'));
$log = new Elastica_Log($client);
$log->log('hello world');

If logging is enabled, all requests are currently logged automatically. Requests are converted to log messages in a special way: each message is written in shell format as a curl command, so every log line can be pasted directly into the shell to try it out. This is quite handy for debugging and for creating a gist if others ask what the query looks like. Furthermore, it makes it simpler to figure out whether a problem is related to Elastica or not.

For example, the log output for a request that updates the number of replicas setting of the index test would look like this:

curl -XPUT http://localhost:9200/test/_settings -d '{"index":{"number_of_replicas":0}}'
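
To illustrate how such a log line can be produced, here is a minimal sketch. It is not Elastica's actual implementation: the function requestToCurl() and its parameters are hypothetical, and the quoting relies on escapeshellarg(), which uses single quotes on Unix-like systems.

```php
<?php
// Hypothetical helper, not part of Elastica: formats a request as a curl command.
function requestToCurl(string $method, string $host, int $port, string $path, array $data = array()): string
{
    $command = sprintf('curl -X%s http://%s:%d/%s', $method, $host, $port, ltrim($path, '/'));

    // Only append a body if there is data to send
    if ($data !== array()) {
        $command .= ' -d ' . escapeshellarg(json_encode($data));
    }

    return $command;
}

$line = requestToCurl('PUT', 'localhost', 9200, 'test/_settings', array(
    'index' => array('number_of_replicas' => 0),
));

echo $line, "\n";
```

On a Unix-like system this prints the same line as the example above.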
 
Vps.net - You get what you pay for
Written by Nicolas Ruflin   
Saturday, 29 October 2011 18:50

I have been a vps.net user for more than two years. A few days ago vps.net published a blog post on offering a free node for every user that writes a blog post on their service. So here we are.

The most important message first: With vps.net, you get what you pay for. This can be seen in a positive, but also a negative way. Vps.net is one of the cheaper VPS hosts (as long as you don’t use lots of nodes), but at the same time it is one of the less stable ones. The performance is good, but they have quite some downtime. So if you are looking for a cheap server with good speed, vps.net is not such a bad choice, but if you plan on running a production server or other things that should be up 24/7, I would recommend looking somewhere else (where you will pay more).

An alternative to vps.net that I have been wanting to try out for a long time now is Linode. In contrast to vps.net, they increased the amount of RAM you get for a single node in the last two years. The 376MB on vps.net are really not that much.

For everyone who wants to hear more details, I had the following story with vps.net in the last two years.

Scaling - Up (sometimes) but not down

About two years ago we started to host useKit.com on vps.net, as we wanted it to scale as soon as we had more traffic on the site. But there were several issues with what we had in mind. Scaling up was easy (most of the time), but scaling down took forever, and every time we wanted to scale down, we had to shut down our servers for quite a long time (I don't know if that is better now), which was not really favourable to our business. Also, we were unlucky enough that vps.net had some downtime every time we had the most traffic on our servers, which was quite bad for our users (and for us).

Sometimes, when we had peak traffic on our servers, one server just stopped working from one second to the next. In the log it looked like someone had pulled the plug on the server. For more than three months, I discussed this problem back and forth with the vps.net support. Until the end it was unclear what the real problem was, but vps.net always suggested it was probably related to our setup, even though the same setup runs on other servers without any issues. Because of this (and the other things mentioned above) we decided to move all our production servers to another host, where none of these problems appeared again.

As we are using quite a few so called NoSQL database systems like MongoDB, Redis and elasticsearch, we need servers that can handle quite a high number of open files. In general this is not a problem when Linux is configured correctly. But as the issue above always happened when one of these servers had a small peak, I assumed it was related to this issue.

Support - Solving issues but not resolving problems

Because of the issues we had, I had a lot of contact with the vps.net support. First the good thing: they reply quite fast and can fix the issue most of the time. The bad thing is that if you have a real problem, the support can be quite frustrating. If you open a ticket that goes back and forth, you will deal with several different support agents. Sometimes, after a subject had been discussed back and forth, the agent changed and the whole thing started from the beginning. The same questions were asked again. I somehow got the feeling that sometimes they were too lazy to read up on what had been discussed before.

But the main problem for me was that they were good at fixing issues, but not good at all at explaining how the issue was resolved or what the issue was. That was something I always asked for but never got a good answer to. If I want to run my production servers on a host, I want to know in more detail what kinds of issues it has and how it resolves them. I want to know whether my issue was something unexpected that is fixed now and should not happen again, or whether it is an open bug.

One thing that describes this quite well is an issue that I had about ten times in the last two years: my server stats in the vps.net admin panel disappeared. Sometimes they just stopped working even though my server was still running and my own stats (Cacti) showed the right results. Every time I contacted the vps.net support, the issue was fixed in a few minutes, but they never explained to me why this happened again and again. I felt like they were just restarting the stats service every time they got this bug report but never tried to figure out why it happened.

There are a few really good support engineers at vps.net, mostly L3 (or higher?). But it is soooo hard to get through to them. The best answer I got from an L1 support agent for the problem above (no stats were shown) was that this is not an issue, as my server has 0 traffic and 0 CPU usage and this is the reason that nothing is shown (really?).

A small side story happened just a week ago. Vps.net planned on moving my node to a new cluster. They sent me an e-mail three days before the planned move and informed me that my server would be down for 2-3h. If it were a production server, this would not be acceptable for me. The notice is too short and the time slot was during peak traffic on the server. Imagine all your nodes were on the same cluster ... Even worse, I had some downtime, but they failed to move the server and scheduled it again for next week? Apparently they had some issues but never mentioned what the issues were ...

At the moment, I still have a small development server at vps.net to run continuous integration tests for some open source projects such as Elastica. But all my other servers that have to be up 24/7 I moved to other hosts. What is nice about vps.net is that it is really simple to start up a single additional node for one day to run some tests for only $1.

Conclusion

Vps.net is nice for some cheap test nodes, but not for production servers.

 
Storing and Analyzing Social Data
Written by Nicolas Ruflin   
Sunday, 17 April 2011 22:04

From July 2010 to December 2010 I wrote my master's thesis, titled "Storing and Analyzing Social Data". I looked at the structure of social data, the ways social data can be stored (e.g. NoSQL solutions) and the processing of social data. Anyone interested in these topics can download the thesis here.

For a quick overview, here is the abstract of the thesis, with the PDF embedded directly in the browser below.

Abstract

Social platforms such as Facebook and Twitter have been growing exponentially in the last few years. As a result of this growth, the amount of social data increased enormously. The need for storing and analyzing social data became crucial. New storage solutions – also called NoSQL – were therefore created to fulfill this need. This thesis will analyze the structure of social data and give an overview of currently used storage systems and their respective advantages and disadvantages for differently structured social data. Thus, the main goal of this thesis is to find out the structure of social data and to identify which types of storage systems are suitable for storing and processing social data. Based on concrete implementations of the different storage systems it is analyzed which solutions fit which type of data and how the data can be processed and analyzed in the respective system. A focus lies on simple analyzing methods such as the degree centrality and simplified PageRank calculations.

 
Joomla DokuWiki Bridge
Written by Nicolas Ruflin   
Saturday, 16 April 2011 22:30

Many visitors to my site are looking for the DokuWiki Bridge. Since I updated my Joomla version, some direct links to the posts no longer work. I have therefore collected the posts here again:

Please also read the comments. They contain very useful hints as well.

 
Joomla Update Odyssey from 1.0 to 1.5
Written by Nicolas Ruflin   
Wednesday, 13 April 2011 22:08

My blog has been silent for more than a year now. There are various reasons for this, but one of the main ones is probably that I had been wanting to do the Joomla migration from 1.0 to 1.5 for more than a year. Unfortunately, this turned into a longer odyssey. So here is a short overview with some additional tips for the migration.

Basically, I wanted to migrate four things:

  • Blog entries
  • Blog comments
  • Image gallery (JoomGallery)
  • URLs

After several failed attempts I had to accept that I could forget about the URLs. I had used some old add-on plugin that no longer exists for 1.5. That meant I could concentrate on the remaining three. To test the migration I had created a local copy of the MySQL DB and all files, which ran on my machine under ruflin.dev.

To my surprise, JoomGallery was the easiest part. I installed a completely fresh version of Joomla 1.5, copied all JoomGallery files into the 1.5 directory, copied the corresponding DB tables into the new DB and installed the new JoomGallery in 1.5. The migration ran automatically and worked without problems.

I tried the Migration Assistant for the migration from 1.0 to 1.5. Unfortunately, I failed several times. Apparently I had installed too many hacks into my Joomla version over the last years. During the migration I got masses of errors, even though I had adjusted the structure of various tables. In the end I decided on a manual migration. I wrote a small PHP script that copies all content into the new DB. That worked, but afterwards the encoding was of course wrong. I had already had problems with this earlier, since at some point I had switched my tables to utf8. To convert the content cleanly, I wrote the following additional script:

// Note: uses the legacy mysql_* extension (removed in PHP 7)
mysql_connect('localhost', 'name', 'pw');
mysql_select_db('db');
mysql_set_charset('utf8');

$result = mysql_query('SELECT * FROM jos_jcomments');

while ($row = mysql_fetch_assoc($result)) {
	// "Ã" marks a title that was UTF-8 encoded twice; decode it once
	if (strpos($row['title'], 'Ã') !== false) {
		$title = utf8_decode($row['title']);
	} else {
		$title = $row['title'];
	}

	// Write the repaired title back to the same comment row
	$query = "UPDATE `jos_jcomments` SET
		`title` = '" . mysql_real_escape_string($title) . "'
		WHERE `jos_jcomments`.`cid` = " . (int) $row['cid'];

	mysql_query($query);
}

The script decodes every string in which it finds the character Ã. This may be a very specific problem that only occurred in my setup, but I thought I would post the script in case it helps someone else. In the example above I convert all comment titles, but I did the same for all content items (title, content, ...). Only the parameters have to be adjusted.
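
The decoding step itself can be tried without a database. The following is a minimal sketch, assuming the mbstring extension is available and the file is saved as UTF-8. The function name fixDoubleEncoding() is hypothetical, and mb_convert_encoding() is used here in place of utf8_decode(), doing the same UTF-8 to ISO-8859-1 conversion.

```php
<?php
// Hypothetical helper: undoes one round of accidental double UTF-8 encoding.
function fixDoubleEncoding(string $value): string
{
    // "Ã" is the typical trace of UTF-8 text that was encoded to UTF-8 a second time
    if (strpos($value, 'Ã') === false) {
        return $value;
    }

    // Same conversion as utf8_decode(): UTF-8 characters back to single Latin-1 bytes
    return mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
}

$correct = 'Zürich';

// Simulate the damage: treat the UTF-8 bytes as Latin-1 and encode them again
$broken = mb_convert_encoding($correct, 'UTF-8', 'ISO-8859-1');

$repaired = fixDoubleEncoding($broken);
```

The check for Ã is only a heuristic: a string that legitimately contains this character would be decoded by mistake, so it is worth testing on a copy of the data first.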

Migrating the comments was fortunately relatively easy, since my new comment plugin JComments offers import options for a fairly large number of old components.

Now my Joomla version is up to date again and I hope to write more blog entries in the future. Unfortunately, I lost all "direct" links to the old entries. I hope Google will reindex my site as soon as possible so that the old entries can be found under the right address again. Let's see how long I can put off the migration to 1.6, and whether it will be as big a hurdle again. In general I will probably not tinker too much with the default installation anymore, since doing so makes a "simple" update almost impossible. To everyone who has not made the switch yet: good luck.

 
Season finale away against TV Frick
Written by Roland Lüpold   
Wednesday, 24 March 2010 17:02

This coming Saturday, the men's first team of HSG Siggenthal/Vom Stein Baden plays its last game of the season away against TV Frick.

 