Abstract
Every once in a while I wrote down questions that were asked many times via mail. At first I created a faq.txt, but soon I realized that what was really needed was more information on how this thing works. Read on to find out.
Where do you get your data from?
Well, the data is all around: it's on the radio, on Google, in news streams, on television, Facebook and Twitter: everywhere! All I do is grab that data, consolidate and structure it and, last but not least, verify and publish it.
That in fact is the core capability of the API.
'Okay', you might be saying, 'but I'm hanging on that Event-API and cannot understand why this event was fired, and why now (and not already or even earlier).'
Granted! There are several ways I gather data. Let's get into some more detail.
One is input by hand. Really. At least at times, and at the very least I do some rough quality assurance each and every day. Yes, that's quite an effort, but somehow it became my personal standard (or 'bad' habit) to deliver only correct and verified data, as I had used so many APIs that delivered wrong data. I want to provide a better API.
Over time I wrote several bots that harvest the web for information. These bots are hooked up to Facebook, RSS feeds or anything else. Some do not work any more, some do. Recently I coded a new bot that looks through the Twitter stream, and it's the fastest and most sophisticated one so far.
However, the centerpiece is not the bots but an Elasticsearch-backed receiver you can throw anything at, and it parses out the desired information. Say, for example, you send a string like #FCBSEV 1-0 to that receiver: it recognizes that the home team must be either FC Bayern Munich or FC Barcelona, and that Sevilla is for sure the away team. That's a lot of information already, and if that information comes in about 3 times there can't be too many games to choose from, so updating the score is not too difficult any more. That's basically how it works. Currently the settings for that receiver are still conservative, so I only catch about 80% of all games with near-realtime scores. That's why some games are updated faster than others.
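To make the idea a bit more tangible, here is a minimal sketch in Python. It is not the real receiver: the lookup table TEAM_CODES, the fixture list FIXTURES, the Receiver class and the threshold of 3 are all made up for illustration, and a plain dictionary stands in for the Elasticsearch lookup. It also takes one possible route to narrowing down the game, namely matching the candidates against a list of scheduled fixtures, which is an assumption on top of what is described above.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical lookup table: three-letter code -> clubs it might refer to.
# In this sketch a dict stands in for the Elasticsearch-backed lookup.
TEAM_CODES = {
    "FCB": ["FC Bayern Munich", "FC Barcelona"],
    "SEV": ["Sevilla"],
}

# Hypothetical list of games scheduled for today (home, away).
FIXTURES = [
    ("FC Barcelona", "Sevilla"),
    ("FC Bayern Munich", "Borussia Dortmund"),
]

# Matches strings such as "#FCBSEV 1-0".
PATTERN = re.compile(r"#([A-Z]{3})([A-Z]{3})\s+(\d+)\s*-\s*(\d+)")


@dataclass(frozen=True)
class Observation:
    home: str
    away: str
    home_goals: int
    away_goals: int


def parse(message):
    """Return every scheduled game (with score) the message could describe."""
    match = PATTERN.search(message)
    if match is None:
        return []
    home_code, away_code, home_goals, away_goals = match.groups()
    homes = TEAM_CODES.get(home_code, [])
    aways = TEAM_CODES.get(away_code, [])
    return [
        Observation(home, away, int(home_goals), int(away_goals))
        for home, away in FIXTURES
        if home in homes and away in aways
    ]


class Receiver:
    """Accepts a score only after it has been seen `threshold` times."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.votes = Counter()
        self.scores = {}

    def receive(self, message):
        """Count the message; return the observation once it is confirmed."""
        candidates = parse(message)
        if len(candidates) != 1:
            return None  # unknown or still ambiguous: stay conservative
        observation = candidates[0]
        self.votes[observation] += 1
        if self.votes[observation] == self.threshold:
            key = (observation.home, observation.away)
            self.scores[key] = (observation.home_goals, observation.away_goals)
            return observation
        return None
```

Feeding the same "#FCBSEV 1-0" string into a Receiver three times confirms the score on the third call. A higher threshold makes the receiver more conservative: fewer wrong scores, but slower updates and more games that never get confirmed.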
Technologies and architecture
When one day I realized that quite a lot of stuff was working together and I was losing track, I decided to collect everything and draw a diagram to visualize the components. As work is in progress, some parts are not up to date any more, but more or less that is the architecture of the exposed API.
Java Platform
Yes, it's a monster. But in the end my compiled war files just work, they run stable and I can rely on them. I feel good having such a mature platform as a solid base.
Grails
is the core. At some point I split the entire codebase into two projects, backend and frontend, and moved the stuff they share into a plugin that is used by both. I think Grails is great and Groovy is great. So many neat features and so much productivity.
RabbitMQ
is no less core. It enables me to decouple and orchestrate things.
MySQL
is still there, yep. It works and is solid.
Elasticsearch
definitely rocks. I gather information that 10 years ago I wondered whether it was even possible to get.
Redis
works and has its place. But I'm refactoring its current usage into native Go at the moment. Not sure if it will last.
Hekad
is only a small helper/utility, but I must mention it explicitly because it's awesome. If you need to pass information around between different systems, have a look at it.
If you have strong experience in one of these fields, feel free to contact me for an open conversation. Everything is working just fine, but there are things where I'm not sure I follow best practice, and I definitely would appreciate some qualified conversation.
daniel