That Big Data problem – Thinking the Hadoop way

Posted by Fernando Doglio on March 20, 2012

What is the “big data problem”?

“On the night of July 9, 1958 an earthquake along the Fairweather Fault in the Alaska Panhandle loosened about 40 million cubic yards (30.6 million cubic meters) of rock high above the northeastern shore of Lituya Bay. This mass of rock plunged from an altitude of approximately 3000 feet (914 meters) down into the waters of Gilbert Inlet (see map below). The impact generated a local tsunami that crashed against the southwest shoreline of Gilbert Inlet. The wave hit with such power that it swept completely over the spur of land that separates Gilbert Inlet from the main body of Lituya Bay. The wave then continued down the entire length of Lituya Bay, over La Chaussee Spit and into the Gulf of Alaska. The force of the wave removed all trees and vegetation from elevations as high as 1720 feet (524 meters) above sea level. Millions of trees were uprooted and swept away by the wave. This is the highest wave that has ever been known.” (quoted from http://geology.com/records/biggest-tsunami.shtml)

Now let’s use our imagination a bit and pretend we’re in a digital world, where an even bigger wave can be seen on the horizon, only this wave is made up of 1’s and 0’s. That’s the current state of information on the net.

A huge wave of data is being generated every second, ranging from user-generated information such as tweets, status updates, uploaded pictures, blog posts, comments, text messages and e-mails to machine-generated data such as server access logs, error logs and transaction logs.

And that’s not even the problem. The problem is that, to handle the big wave that’s coming, we need to start thinking in terms of terabytes or even petabytes of information, and billions of rows instead of millions.

Continue reading…

Interesting Ruby’s websites for beginners – cool initiatives

Posted by Mariana De Carli on January 26, 2012

The coolest thing about Ruby is that, even though it’s a dynamic and reflective programming language, it’s very easy to learn. More and more programmers around the world are interested in learning this new tool for making cool things. We’ve selected 3 interesting websites to help people (kids or adults) get to know a little more of Ruby’s world.

Kids Ruby (for kids)

http://kidsruby.com: Kids Ruby is aimed especially at kids, with a very simple interface that lets you see the code, run it, and see its output at the same time. Kids Ruby is also attractive for its Turtle graphics, which let you draw pictures and have fun mixing and trying colors. Kids Ruby includes a lot of useful resources, and you don’t even need an internet connection for it to work. The developers also created a complete KidsRuby operating system based on Ubuntu Linux that makes programming in Ruby a lot easier for kids.

Rails for zombies

http://railsforzombies.org/: Rails for Zombies teaches Ruby on Rails, an open-source web framework with all the power of the Ruby language, and needs no additional configuration. On this site you can watch tutorial videos that teach you Ruby on Rails in just five levels. After each video you’ll be challenged with cool exercises to practice your new skills. So if you’re a zombie and you’re hungry for Ruby knowledge, this is the perfect site for you.

Try Ruby

http://tryruby.org/: This website offers a very interactive Ruby tutorial; you can try new functions step by step and learn a little more about the language. In just 15 minutes you can understand what Ruby is about. The site also lets you save your progress by signing up for free at Code School.

Now that you have these very easy options to learn Ruby, why don’t you try it out? Maybe we’ll see you soon as a new member of Moove-IT’s team ;)

Digital Blackout against SOPA – PIPA

Posted by Mariana De Carli on January 18, 2012

The largest online protest in the history of the Internet is taking place today. More than one hundred sites, including the popular Wikipedia, Google and WordPress, have confirmed their participation in this digital blackout against the new anti-piracy laws of the USA.

The Stop Online Piracy Act (SOPA) and the Protect IP Act (PIPA) will be voted on by Congress on Jan 24th, in an attempt to pass internet censorship in the Senate. These two laws are probably the most rejected ones among American citizens, because some of them consider that they affect the most appreciated thing on the internet: freedom.

The SOPA law attempts to close any foreign site that sells or shares pirated content from the USA, including music, films, books and any product not authorized for free distribution on the internet. The PIPA law, meanwhile, focuses directly on protecting intellectual property, guarding against economic threats and the theft of creative work.

Still, these laws have strong support from big industry players such as the National Cable & Telecommunications Association, the National Association of Theatre Owners, Viacom, the Copyright Alliance and NBC Universal, which argue that their businesses are dramatically affected by online piracy.

We will have to wait until January 24 to see whether public opinion has a direct influence on the fate of these laws.

Wikipedia.org – Home page Jan 18th

Google.com – Home page Jan 18th

WordPress.com – Home page Jan 18th

Meet Moove-iT’s UX Group

Posted by sebastian.suttner on November 24, 2011

What do users really need? That’s the question any software developer should ask themselves.

Here at moove-it, we always put ourselves in our clients’ shoes to understand their needs and give them exactly what they are looking for. To do so, we’ve created the UX Department.

From the moment we started getting together to discuss the latest design patterns and the top UX trends, we knew something great would come out of it, and so it did. We managed to share what we’ve learned with the whole team, improve existing products and enhance new projects’ design from scratch.
We’ll keep working as hard as possible on UX, not only because of how thrilled we are with the results, but also because the way users feel about the product is what matters most.

Here are the partial conclusions reached by the UX group. We are sharing the presentation (in Spanish).

It’s about timing baby!

Posted by Andreas Fast on November 15, 2011

Yeah, it’s about timing.

There was a problem in one of our projects at moove-it related to slow processing. A daemon spawns new threads to process certain new entries in the database. The entries come from a different system, which is why this program exists: it processes each new entry. At certain hours of the day there are peaks in the entries to the database, and the process falls behind by about 20,000 entries or more. So we started analyzing the code to understand what was happening and why it took so long. We noticed that each new thread the daemon spawned took about 5 seconds to complete its task. As we narrowed the measurement, we arrived at some code that took 5 seconds to execute but only involved access to the database. Thanks to Aaron Patterson’s (@tenderlove) talk at RubyConf Uruguay, “Who makes the best asado”, where he talked about Rails and how it manages threads and database connections, we knew where to look.

What he explained is that each new thread requests its own database connection from the connection pool. If there isn’t a free connection, Rails waits for about 5 seconds, and if there is still no free connection after that, it iterates over all the threads to take back the connections of dead threads (more info). See the correlation with the 5 seconds I mentioned in the previous paragraph? We immediately suspected that this was the problem, so we started searching the Rails API for a way to release the connection at the end of each thread’s execution. Surprisingly, our first round of googling didn’t turn up an easy, understandable explanation ;) , so we dug deeper and came up with the following line:

ActiveRecord::Base.connection_handler.clear_active_connections!

The ActiveRecord::Base.connection_handler method returns the connection handler, and its clear_active_connections! method does what its name suggests. From the Rails docs: “Returns any connections in use by the current thread back to the pool, and also returns connections to the pool cached by threads that are no longer alive.”

So this line returns the connections in use by a thread to the pool, which lets the new threads spawned by the daemon use the freed connections. This way we avoid the 5-second wait for Rails to free the connections for us.
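To make the placement of that line concrete, here is a minimal sketch of how a worker thread could release its connections when it finishes. The helper name process_in_thread and the Entry model in the usage comment are made up for illustration and are not our actual daemon code; the only Rails call involved is the clear_active_connections! line quoted above, passed in as handler so the pattern itself can be read (and tried) without a database.

```ruby
# Hypothetical helper (not the daemon's real code): runs the given block for
# one entry in its own thread and always hands the thread's connections back
# to the pool when the work is done.
#
# `handler` is assumed to respond to clear_active_connections!, the way
# ActiveRecord::Base.connection_handler does.
def process_in_thread(entry, handler, &work)
  Thread.new do
    begin
      work.call(entry)
    ensure
      # ensure runs even when the work raises, so a failed entry
      # cannot leave a connection checked out.
      handler.clear_active_connections!
    end
  end
end

# Inside the Rails daemon this might be called as follows
# (Entry and process! are made-up names):
#
#   process_in_thread(entry, ActiveRecord::Base.connection_handler) do |e|
#     Entry.find(e.id).process!
#   end
```

Putting the call in an ensure block is a design choice rather than something Rails requires: it simply guarantees the release happens on every exit path of the thread.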

This one line took us from processing 1,000 entries in almost 2 hours to processing 10,000 in 5 minutes, roughly 240 times the throughput. Nice, huh?!

That’s it. I’m not sure this is the best way of doing it, since this method also “… returns connections to the pool cached by threads that are no longer alive.” I guess this means it does the iteration over all the threads that Aaron mentioned, but as you can see I’m happy with the performance improvement. We are using Rails 3.0.5. Aaron said that he will change this behavior; read more about it here.

Special thanks to @cheloeloelo, who helped detect the problem and dig through the Rails API to find the proper method to free the connections.

Image: Suat Eman / FreeDigitalPhotos.net

We run Montevideo 2011 – 10K Nike competition

Posted by Ariel Ludueña on November 08, 2011

Last Saturday a group of brave Moovetians decided to accept the challenge and run the Nike 10K competition.

The Nike 10K consists of running 10 km along the coastline through some of Montevideo’s neighborhoods, enjoying the beautiful landscape.

Take a look at the pictures.

The whole team 

Silvana, Martin and Nicolas

Ariel before breaking the finish ribbon  :-P

Bird’s eye view

Dart – A new language for structured web programming

Posted by Andreas Fast on November 02, 2011

On October 10th, Lars Bak and Gilad Bracha presented a technology preview of Dart. Both are Google employees leading the development of Dart. Dart is open source, so anyone can use and change it. It’s still in its early stages, but the design goals are very clear: to be a structured yet flexible programming language for the web; to feel familiar and be easy to learn; to deliver high performance and fast startup; and to be appropriate for all devices, from phones and tablets to notebooks and servers. There is also a lot of work being done on tools so that Dart runs fast on all major modern browsers. It runs on a virtual machine on the server, and there is a tool that compiles the code to JavaScript so it can run in a browser. It also provides a DOM API.
The following presentation shows the basics of the language including some examples. In addition, here are some photos of the presentation at moove-iT!

Second RubyConf in Uruguay – 11th and 12th November

Posted by Gabriela Isnardi on November 01, 2011

We are sponsoring one of the greatest technology events here in Uruguay: the second RubyConf, taking place in less than two weeks, on the 11th and 12th of November 2011, where many IT experts from all over the world get together to immerse themselves in this dynamic world and get up to date with the latest trends in Ruby and Agile methodologies.

RubyConf Uruguay 2011

We are hungry for knowledge and refreshment, and we all want to be on the same train.

Please welcome all the new members to this awesome community, and help spread the news. Even more important, do not miss the opportunity to meet the experts, discuss the future of RoR, and be Rail!

You can’t help but Mooving… (percussion workshop)

Posted by Gabriela Isnardi on October 17, 2011

Our main tradition, heritage and passion: The beat of the “Uruguayan” drums.

It doesn’t matter if you listen to the drums once a month, every day, or once a year. When it comes to local music, there is nothing more Uruguayan than the sound produced by rhythmically striking a drum, especially when playing Candombe, a unique style of percussion that will make your hair stand on end.

Team building activities can range from treasure hunts to safari trips, but this time we decided on one that people could easily identify with, and that requires no sophisticated skills, only the desire to unwind, switch off and connect. The drumming comes first; the bonding is just a consequence.

Last week we had our first percussion workshop. Pablo Leites, an excellent musician and percussionist also known as “Gancho”, was our instructor. He has also been Martin Cabrera’s (Moove-IT cofounder) best friend for a long time.

Please have a look at the following pictures…

Why percussion

Why is percussion so important to us? Being a country of immigrants, Uruguay was formed by people from all over the world. And as always happens, music has played a tremendously positive role in bringing people together and creating stronger, more significant bonds. We connect with our basic instincts, we forget about language barriers, cultural differences and rank, and we just let ourselves feel and relax. Just listen to a few strikes and you will feel multicolor, ageless and energized.

You may have heard of Las Llamadas (The Callings), a popular annual event during Carnival here in Uruguay, which gathers thousands of people from all over the world. The drums are the main stars, and the African music roots brought by the people once made slaves in this country (and happily freed more than 150 years ago) are now our truly genuine and local music.

I personally love this rhythm, and even though I am not a music expert I will recognize its pace wherever I go. I am not sure if it is the adrenaline that runs through your body, or the inseparable link to human nature, but percussion makes your body shake, like toddlers instinctively struggling to move their bodies to the rhythm of the music.

Team Building !!

Alternatives to full text queries (part II)

Posted by Fernando Doglio on October 12, 2011

For part one, click here

What do they have in common and what makes them different?

Even though it’s hard to come up with a comparison table between all four alternatives, mainly because I can’t claim to have personal experience with all of them, the Internet has a lot of information on the subject, so I went ahead and did a bit of research on the matter.

Another point to consider is that, although in the long run all four solutions provide very similar services, they do it a bit differently, and they fall into two categories:

  • Full text search servers: They provide a finished solution, ready for developers to install and interact with. You don’t have to integrate them into your application; you only have to interact with them. Here we have Solr and Sphinx.
  • Full text search APIs: They provide the functionality the developer needs, but at a lower level. You’ll need to integrate these APIs into your application, instead of just consuming their services through a standard interface (as happens with the servers). Here we have the Lucene project and the Xapian project.
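To give a rough feel for the difference between the two categories, here is a small Ruby sketch. The first part only builds the URL an application would call on a search server (the /select path with a q parameter is how Solr is usually queried; the host and port are assumptions about a local install). The second part is a toy in-process index standing in for what a search library does inside your application; TinyIndex is a made-up class and is not the Lucene or Xapian API.

```ruby
require "uri"

# Server style: the application only builds a request and talks to the
# search server over HTTP. The base URL is whatever your install uses.
def solr_query_url(base, query)
  "#{base}/select?#{URI.encode_www_form(q: query)}"
end

# API style: indexing and searching live inside the application itself.
# This toy inverted index only illustrates the idea.
class TinyIndex
  def initialize
    # Maps each term to the list of document ids that contain it.
    @postings = Hash.new { |hash, term| hash[term] = [] }
  end

  # Split the text into lowercase words and record the document under each.
  def add(doc_id, text)
    text.downcase.scan(/\w+/).uniq.each { |term| @postings[term] << doc_id }
  end

  # Return the ids of the documents that contain every word of the query.
  def search(query)
    terms = query.downcase.scan(/\w+/)
    return [] if terms.empty?
    terms.map { |term| @postings.fetch(term, []) }.reduce(:&)
  end
end

# Usage:
#   solr_query_url("http://localhost:8983/solr", "full text")
#   index = TinyIndex.new
#   index.add(1, "Full text search with Solr")
#   index.search("full search")
```

With a server, everything past the URL is someone else’s job; with an API, code like TinyIndex (only far more capable) becomes part of your own application.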

Taking all of this into account, we can now proceed into a more in-depth discussion about our options:

Continue reading…