When to think about using Cloud for your service

Three years ago I wrote “The Cloud is overrated!”. Since then I joined Google as an SRE, and I’ve been asking myself whether Cloud makes sense for me or not. Even before joining Google, GCP was my first option for Cloud; it seems quite good and, of the three major providers (along with AWS and Azure), it is the cheapest option. And let’s be fair, my main complaint about Cloud is price. Vendor lock-in is my second concern, and Google again seems to be the fairest of the three. Anyway, this isn’t about which provider is better, but about when Cloud is a good idea.

Proper Cloud deployments are pricey and also require a lot of developer resources; doing it right is not the same as deploying a WordPress on a VPS-style service in the cloud.

What is Cloud about?

Cloud is about having easy-to-use tools to deploy scalable and reliable applications without having to worry about how to implement the details.

We need to think about scaling and zero downtime. These are the only two factors that will determine if you should pay the extra cost or not.

Everything else is extra services they provide for you, such as Machine Learning. If you want to use those services, you can always set up the minimum on the given Cloud to make them work and call them from the outside, no problem. So these are out of my analysis here.

Vertical Scaling

When you deploy an application on a server and later need more resources, you’ll have to migrate it to a different, beefier server if it no longer fits. On a VPS you usually have the option to upgrade to more compute resources as well.

In Cloud, the range of machines you can run code on is quite big. From tiny (1/2 CPU, 1 GiB RAM) to enormous (32 CPU, 512 GiB RAM). This gives quite the flexibility to keep growing any service as needed.

The other thing is that they allow for fast upgrades and downgrades, and can even automate them. This can be used to reduce the cost overnight when there is less load. But be aware that even with this, it’s highly unlikely that you’ll get a cheaper option than a bare metal server.

Same as a VPS, Cloud services usually guarantee the durability of your data; no need to do maintenance or migrations because a disk fails. This is the downside of bare metal servers: you need to handle the maintenance and migrate to a new server if the disks start to fail, risking data loss.

Horizontal Scaling

This kind of scaling refers to splitting the service into different partial copies so they can work together in parallel. It is especially needed when the service itself won’t fit on a single machine.

The problem here is that most of the time applications are stateful, and this means that the state needs to be split or replicated across the different instances.

Cloud helps here by providing database and file-sharing services that can do this for you, so your service can stay stateless and leave the complexity to the Cloud provider.

In Cloud, you can also dynamically spawn more instances of your service to handle the load.

Reducing downtime to zero

This is basically done by replicating data and services across different data centers. If one goes down, your service will still be up somewhere else.

This is the most important part I believe, so I’ll leave the details for later.

When should we think about using Cloud?

This is an important decision, as it’s hard to convert a typical service (monolithic, a single thing that does it all) into something that makes good use of the Cloud’s benefits. It’s better to decide this in the design phase if possible.

Sharding

In recent years there has been a boom around “Big Data” and Cloud, and everyone talks about NoSQL, sharding (horizontal scaling), and so on. But a lot of this is just buzzwords, a way of looking cool. Is it really that cool for everyone?

All these things are meant for horizontal scaling (sharding), which means we expect to use more than one machine for one of the services (e.g. the database).

It sounds really cool, but it’s not really worth it for the majority of cases. Unless you have a big project on hand, chances are that it fits in an average server.

Why not use sharding anyway? Well, it’s usually more expensive to run 5 machines than a single one with all that power combined. Sharding also imposes a lot of design restrictions that are quite hard to handle, so it substantially increases the time needed to develop the application. Unexpected requirements along the way will sometimes require a full redesign, because sharding requires certain premises to hold (how to split the service) that cannot be changed along the way without a lot of effort.

The other problem with sharding is that using X machines is always less efficient than using X threads, and X threads are less efficient than a single-threaded CPU that is X times more powerful. Parallelizing does not scale linearly; there’s a trade-off, so always keep this in mind.

Cloud is not (only) sharding, and sharding is not Cloud. If your service will never need to span more than one computer, there’s no point in adding the complexity.

I would recommend plotting a growth forecast for your service over 5-10 years. Also plot the forecast for server growth; it has historically doubled roughly every two years (see Moore’s law). If your growth is close to that, you definitely need to consider sharding from the start. Also bear in mind that there are periods of stagnation, where certain areas see no improvement for years.
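
As a rough illustration of that comparison, here is a minimal sketch; the growth numbers are invented:

    # Compare a made-up service growth forecast against hardware doubling
    # every two years (Moore's law). All numbers here are invented.
    for y in range(0, 11):
        service = 1.8 ** y          # assumed: the service grows ~1.8x per year
        hardware = 2 ** (y / 2)     # ~2x every two years
        verdict = "consider sharding" if service > hardware else "one machine likely fine"
        print(f"year {y:2d}: service x{service:8.1f} vs hardware x{hardware:6.1f} -> {verdict}")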

If you go for sharding, the databases provided by the Cloud provider will make your life much easier, but they will be your vendor lock-in. Once the application is coded with a particular Cloud DB in mind, it will be quite hard to move away from that provider later. If this is a concern, look into how to make it generic enough; there are usually projects that let you swap the DB or offer a plugin to connect to these databases, so you can move later with less effort.

If in doubt, go for sharding. If you already need more than 25% of the biggest machine available, go for sharding. Better safe than sorry.

Replication

For me, this is what applies to most applications and companies: how much is your downtime worth? How much is your data worth?

A server can fail; an entire data center can be struck by lightning or engulfed in flames. Assuming you have your backups off-site, how much data is lost in this scenario? Hours, a day, a week? How much time will be needed to get everything back up and running on a new server?

For example, on a server I use for a personal project I do an on-site database backup every two days and an off-site full-disk backup every day. This means I can lose one or two days of data. But if it happens, it will take me 5 days to get it up and running again (because it’s a weird setup and I can only use my spare time). In this case the downtime and the data are worth almost nothing, as the project generates no revenue while it costs money. Still, the amount of time that would be needed to set it back up is something I need to fix.

To minimize these scenarios we use replication, which should always be off-site. Sharding must be on-site (same data center), while replication is better off-site.

If you shard and manage the database yourself, you can dedicate a fraction of the servers to redundancy. In this case, N+2 is always recommended: if you need 5 servers to handle the load, have 7 so that at least 2 servers can fail. When using RAID yourself, I would recommend RAID 6. In most cases this will not apply.

Regardless, you need a full working copy elsewhere. Here you can go N+1 or N+2. Having another set of servers far away, running the software in parallel, avoids an outage that could last weeks.

When using Cloud you can take advantage of the huge network between the different data centers. Providers usually have a dedicated network, separate from the internet, that is blazing fast with low ping times, which you can use for communication between data centers, making real-time replication across servers possible. Still, don’t go crazy and place the servers very far apart: as fast as those networks are, they still obey physics and are bound by the speed of light (no kidding here; light propagates in fiber at roughly two-thirds of c, and after real-world routing an effective speed of about 50% of c is a reasonable rule of thumb for estimating ping times).
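
As a back-of-the-envelope sketch (the distances below are invented), you can estimate round-trip times from the fiber distance alone:

    # Rough ping estimate from distance: light in fiber does ~200,000 km/s,
    # and ~150,000 km/s (about 50% of c) is a conservative figure once routing
    # detours are accounted for.
    def estimated_rtt_ms(distance_km: float, speed_km_s: float = 150_000) -> float:
        return 2 * distance_km / speed_km_s * 1000  # round trip, in milliseconds

    for name, km in [("same metro area", 50), ("neighbouring country", 1_000), ("across an ocean", 8_000)]:
        print(f"{name:20s} ~{estimated_rtt_ms(km):6.1f} ms best case")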

If you want to use a regular ISP with VPS services, check if they also have an internal network interconnecting the data centers; this is starting to become the norm lately.

The problem with replication is that the cost for running the service is now 2x or 3x, as you need way more space and servers than before.

If cost is a problem, I would recommend doing only a primary plus a “warm” read-only secondary. This means that all writes go to the primary, and the secondary only replays those changes in real time. In an incident, you might lose the last few seconds of data that haven’t reached the secondary yet. If this is a problem, check whether the database can wait until the secondary confirms the data is there; this will come with a huge penalty on write speed and latency.

The secondary could be smaller than the primary, or be used for other things. Only replaying changes uses a very small amount of resources (but the same amount of disk space). In that case, though, if the secondary needs to be promoted to primary, it may suffocate under the load, and the application would be almost unavailable until a new server is brought up. So it’s best to avoid small secondaries if possible: that approach only serves to back up data with a resolution of seconds, and it will not be good enough for taking over.

On Cloud, they can also automate this replication for your database and files, and even automate the failover from secondary replica to primary when things fail. Sharded databases do this best.

My final thoughts

I find Cloud products prohibitively expensive for my personal projects, and adding proper replication puts them even further out of reach.

But on the other hand, I find it extremely difficult to properly prepare automation for replication and takeover myself. These things are hard to build and to test well enough to ensure they will not hurt instead of helping.

So it seems that either there is not much money involved and the risk of data loss or downtime is not a big deal, or there is, and then the price of Cloud is quite justified.

In the end this is about whether you want to take the risks yourself or pay extra so someone else deals with them. Generally I would go with the latter and rest easy.

What if cryptocurrencies were used to perform useful work?

With Bitcoin using more than 140 TWh per year, or 15 GW, and growing, we must ask ourselves: is it really worth that much? Is it providing any useful work?

15 GW is not that much globally speaking, but to put it into perspective, a nuclear power plant produces on average about 1 GW, so we need the equivalent of 15 nuclear plants just to keep mining Bitcoin.
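
A quick back-of-the-envelope check of that conversion:

    # 140 TWh spread over a year, expressed as a continuous power draw.
    twh_per_year = 140
    hours_per_year = 365 * 24                          # 8760
    average_gw = twh_per_year * 1000 / hours_per_year  # TWh -> GWh, then divide by hours
    print(f"{average_gw:.1f} GW")                      # ~16 GW, i.e. roughly 15-16 plants at 1 GW each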

I have never been a believer in Bitcoin and the like; per se they have a lot of costs and don’t provide that much usefulness. The idea is certainly interesting, and I really like the concept of decentralising money away from banks, entities and governments, but the cost is currently just too high.

Also, we need to keep in mind that money is anything we assign value to and are willing to exchange for goods; by that definition, almost anything can be used as long as it is not perishable, easily obtained, or easily duplicated.

“Almost anything” should certainly not cost 15 nuclear power plants. Also, if people don’t switch to using the currency, it is of no use. The range of goods that can be purchased with cryptocurrencies is certainly slim.

A currency should also retain its value over time, and the volatility of the crypto market is so high that holding crypto can be either extremely profitable or turn into wet paper from one day to the next. Having to reprice products every day or every hour is not something anyone wants to do.

Chia is another cryptocurrency that has been getting famous in the last few months. The idea of being far more “eco-friendly” by not wasting so much energy and requiring disk space instead is somewhat encouraging. This, of course, has led retailers to increase HDD prices as they saw a surge in demand. And it’s still not without cost: it still consumes a lot of power, just far less than Bitcoin or others using proof of work.

I feel that cryptocurrencies get most of their value from their features, such as smart contracts and the like. Ethereum is the one most cited for these, and Chia also has its own set of features.

The energy and monetary cost of running crypto should be justified by the useful work it provides. Regular paper money provides useful work by removing the burden of trading goods for goods; this is also true for crypto, but it falls short by several orders of magnitude.

Some features could help with legal processes and reduce human effort in a lot of areas, but governments would need to use or accept them for this to be of any use. And governments are usually decades behind on tech, so I don’t see this happening in the near term. Also, the fact that they would have to put their trust in something they don’t control sounds like quite a blocker to me.

In short, the amount of money saved by doing something using crypto has to overcome the energy cost by a good margin. If not, it’s not a good solution. It’s that simple.

To give an example, computers for accounting purposes had to become cheaper than doing the same thing manually; if that weren’t the case, we would still be doing it with pen and paper. It’s not because “it’s convenient” or “faster”; it’s because having a human do the same tasks costs a lot more than purchasing and owning a computer. As for speed, that too translates back into money: having the right information faster and flawlessly has a value, and you can put a price on it.

So I think of blockchain systems as something still very cool but also very immature. They will get there, but unless something revolutionary happens in the meantime, it will still take many years to see wide usage. They came ahead of their time, and we’re probably not ready yet to profit from them.

(At the moment https://chiacalculator.com/ reports that 1 PiB of space would earn $62,000 per month on a server costing less than $20,000; this is so ridiculous that I expect it to be corrected by supply in the coming months. In fact, users in r/Chia already report no gains from it; the number of people entering because of the investment prospects is probably saturating the network and making it really hard to earn anything.)

An idea came to my mind recently

…and it is most probably either stupid or unfeasible. I don’t have much background in blockchain, nor enough maths, to pursue it. But in case it inspires someone, here it is.

The Chia network basically seems to make servers store “trash data” to prove they actually allocated the space, hence the proof of space (yes, I know it’s much more complicated, but I love oversimplifying).

I was thinking… what if instead of storing crap data they actually stored customer data?

Chia has recently reached 1 exabyte of storage. Storing someone’s data has value, and selling that capacity could be worth millions, especially in Cloud scenarios.

Decentralised storage run by users already has a name: P2P, with BitTorrent being one implementation.

But those networks relied on the willingness of users to serve files for free, and nowadays they are mostly used to combine the bandwidth of several servers so a download can reach the fastest transfer possible.

Instead, what I’m talking about is more in line with this famous Linus Torvalds quote:

Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it 😉

Torvalds, Linus (1996-07-20). Message. linux-kernel mailing list. Source: Wikiquote

Can you upload your backups to BitTorrent and rely on others mirroring them? No. Heck, even FTP is no longer an option, as no one uses it anymore.

Imagine a network where we could send data for retention and pay for that storage in cryptocurrency. Anybody could set up a server, join the pool, and store anybody else’s data for money.

The most basic use of this network would be backups. Upload a backup, put a price on it, and people would start replicating it to collect that money. The more money you put up, the more replicas it is worth making. You don’t need that old backup anymore? Stop paying for it; it will be gone in days as hosts find more profitable data to store.

Of course, anything you upload would be public. So if you don’t want it to be public, you need to encrypt all the data with a secret key. The particular encryption used would be the uploader’s choice; they might want symmetric or asymmetric encryption (although asymmetric is riskier because it inherently has more attack vectors).

The price of storage would fluctuate like the stock market. As more people jump into the pool, the price falls until joining no longer makes economic sense; and as more people upload more data, the price per replica rises.

You don’t want to pay for the space you’re using elsewhere? Fine! Just join the network with the same server, offering the extra space you have, and you’ll earn crypto at the current value, which you can exchange to get your own data stored elsewhere. If the price of storage goes up, so do your profits from storing other people’s data. You no longer need to pay for servers in different regions to guarantee that the data is recoverable if your only server fails. You could also use your home computer for this exchange if you like, or tell it to prioritize your own data.

This idea could be expanded to a lot of interesting use cases, but at first glance it has several problems:

  • You don’t know what you’re storing, or from whom. Your server might contain illegal material without you knowing it. Hopefully the payment is at least trackable.
    • ISPs and others also offer storage and can’t really check what’s on it, especially if it’s encrypted. I guess the law could track the payment and pursue the uploader, or cut off the payment, if that were a problem.
  • A single machine/location could try to claim that it holds more than one copy of the same data, which in reality is pointless.
    • Filtering based on IP might not work, as a machine can have more than one IP.
    • Ping-time analysis to check that replicas are far apart could be tricked by having lots of small servers that actually fetch the data from the same place.
    • Encrypting the same data with different keys could ensure that it is effectively copied. But that is burdensome, and in any case the network needs at least two plain copies to be able to verify that the other end actually holds the data.

I’m probably missing other risks and problems, and it’s also possible that they have some form of workaround. As I said, I’m not any kind of expert in these systems, so I can’t outline a solution myself.

Nonetheless, this seems like an idea worth exploring. The usage could possibly be extended beyond backups, to data that is modified often.

If all the state of an application could be stored in a network like this, then everything that needs to be deployed is basically stateless and can leverage Cloud very easily and cheaply. This, of course, would require that a database could run and be modified quickly on top of it, which is no easy feat.

But circling back to the beginning, a network like this would deliver actual work with actual value that would outweigh the cost of running it. It would therefore give the coin a use and create a market based on supply and demand, not on speculation.

So, as I said, it’s just an idea that crossed my mind. What are your thoughts? Does it seem interesting to you?

Rust vs Python: Rust will not replace Python

I love Python; I’ve used it for 10+ years. I also love Rust; I’ve been learning it for the last year. I wanted a language to replace Python, looked into Go, and was disappointed. I’m excited about Rust, but it’s clear to me that it’s not going to replace Python.

In some areas, yes: there are small niches where Rust can be better than Python and replace it. Games and microservices seem to be among the best candidates, but Rust will need a lot of time to get there. GUI programs are also a good opportunity, but the fact that Rust’s model is so different from regular OOP makes it hard to integrate with existing toolkits, and a GUI toolkit is not something easy to build from scratch.

For CLI programs and utilities, Go will probably prevent Rust from gaining much ground. Go is clearly targeted at this particular scenario, is really simple to learn and code in, and does it really well.

What Python lacks

To understand where other languages have an opportunity to replace Python, we should first look at Python’s shortfalls.

Static Typing

There are lots of things Python could improve, but lately I feel that types are one of the top problems that need to be fixed, and it actually looks fixable.

Python, like JavaScript, is not statically typed. You can’t easily control what the input and output types of functions are, or what the types of local variables are.

There’s now the option to annotate your variables and check them with tools like MyPy or Pytype. This is good and a huge step forward, but insufficient.
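
A small example of the kind of mistakes those checkers catch (a minimal sketch; the exact error wording varies by tool and version):

    # annotated.py -- check with `mypy annotated.py` without executing it.
    def mean(values: list[float]) -> float:
        return sum(values) / len(values)

    mean([1.0, 2.0, 3.0])        # fine
    mean(["a", "b"])             # mypy flags the argument type here
    result: str = mean([1.0])    # mypy flags assigning a float to a str variable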

Having IDE autocompletion, suggestions and inspection helps a lot when writing code, as it speeds up the developer by reducing round-trips to the documentation. On complex codebases it really helps, because you don’t need to navigate through lots of files to figure out the type you’re trying to access.

Without types, an IDE can barely determine the contents of a variable; it has to guess, and that’s not good. Currently, I don’t know of any Python autocompletion based solely on MyPy.

If types were enforced by Python, then the compiler/interpreter could do some extra optimizations that aren’t possible now.

Also, there’s the problem of big Python codebases with contributions from non-senior Python programmers. A senior developer will try to assume a “contract” for functions and objects: what the “valid” inputs are for which it works, and which outputs the caller must check. Having strict types is a good reminder for less experienced people to keep designs and checks consistent.

Just look at how TypeScript improved upon JavaScript simply by requiring types. Going a step further and making Python enforce a minimum, so that the developer has to explicitly opt out of typing something, would make programs easier to maintain overall. Of course this needs a way to be disabled, as forcing it in every scenario would kill a lot of the good things about Python.

And this needs to be enforced down to the libraries. The current problem is that a lot of libraries just don’t care, and if someone wants to enforce typing, it gets painful as the number of dependencies increases.

Static analysis in Python exists, but it is weak. Enforced types would allow better, faster, and more comprehensive static analysis tools to appear. This is a strong point of Rust, as the compiler itself already does a lot of static analysis; if you add other tools like Cargo Clippy, it gets even better.

All of this is important to keep the codebase clean and neat, and to catch bugs before running the code.

Performance

The fact that Python is one of the slowest programming languages in use shouldn’t be news to anyone. But as I covered before in this blog, this is more nuanced than it seems at first.

Python makes heavy use of integration with C libraries, and that’s where its power is unleashed. C code called from Python still runs at C speed, and many C calls release the GIL while they run, allowing a degree of multithreading.
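
A small sketch of that effect: hashing a large buffer happens in C, and CPython releases the GIL for large hashlib inputs, so threads can overlap the work (timings are machine-dependent):

    import hashlib
    import threading
    import time

    data = b"x" * (256 * 1024 * 1024)  # 256 MiB of dummy data

    def digest() -> None:
        # hashlib is C code; CPython releases the GIL for large inputs here.
        hashlib.sha256(data).hexdigest()

    start = time.perf_counter()
    for _ in range(4):
        digest()
    print("sequential:", round(time.perf_counter() - start, 2), "s")

    start = time.perf_counter()
    threads = [threading.Thread(target=digest) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("4 threads: ", round(time.perf_counter() - start, 2), "s")  # less on a multi-core box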

The slowness of Python comes from the amount of magic it can do: almost anything can be replaced, mocked, whatever you want. This makes Python especially good for designing complex logic, as it can hide it very nicely. And monkey-patching is very useful in several scenarios.

Python works really well with Machine Learning tooling, as it is a good interface to design what the ML libraries should do. It might be slow, but a few lines of code that configure the underlying libraries take almost zero time, and those libraries do the hard work. So ML in Python is really fast and convenient.

Also, don’t forget that when such levels of introspection and “magic” are needed, it is slow regardless of the language. This can be seen when comparing ORMs between Python and Go: as soon as the ORM is doing the magic for you, it becomes slow in any language. To prevent this you need an ORM that is simple, and therefore less automatic and convenient.

The problem arises when we need to do something for which no C-backed library exists. We end up coding the actual thing in pure Python, and that becomes painfully slow.

PyPy solves part of the problem. It can optimize some pure Python code and run it at speeds close to JavaScript and Go (note that JavaScript is really fast). There are two problems with this approach: the first is that the majority of Python code can’t be optimized enough to get good performance; the second is that PyPy is not compatible with all libraries, since C extensions need to be compiled against PyPy instead of CPython.

If Python were stricter by default, allowing the wizardry only when the developer really needs it and enforcing this via annotations (types and the like), I guess both PyPy and CPython could optimize further, as they could make better assumptions about how the code is supposed to run.

The ML libraries and similar ones are able to build C code on the fly, and that should be possible for CPython itself too. If Python included a sub-language for high-performance work, even at the cost of longer program start-up, it would let programmers optimize the critical parts of the code that are especially slow. But this would need to be part of the main language and bundled with every Python installation. It would also mean that some libraries could get away with pure Python, without having to release binaries, which in turn would increase their compatibility with other interpreters like PyPy.

There are Cython and Pyrex, which I have used in the past, but the problem with these is that they force you to build the code for the different CPU targets and Python versions, and that’s hard to maintain. Building anything on Windows is quite painful.

The GIL is another front here. Because Python only executes one instruction at a time, threads cannot be used to distribute pure-Python CPU-intensive operations between cores. Better Python optimizations could in fact relieve this by determining that function A is totally independent of function B and allowing them to run in parallel; or they could even compile them into non-Python instructions when the code clearly isn’t using any Python magic. That would allow the GIL to be released and hence much better parallelism.
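
Until something like that exists, the standard workaround for pure-Python CPU-bound work is to sidestep the GIL with processes instead of threads; a minimal sketch:

    from multiprocessing import Pool

    def crunch(n: int) -> int:
        # Pure-Python CPU-bound work: threads would serialize on the GIL,
        # but separate processes each get their own interpreter and core.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            results = pool.map(crunch, [2_000_000] * 4)
        print(sum(results))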

Python & Rust together via WASM

This could solve a great part of the problem if it turns out to be easy and simple to use. WebAssembly (WASM) was conceived as a way to replace JavaScript in browsers, but the neat thing is that it produces code that can be run from any programming language and is independent of the CPU target.

I haven’t explored this myself, but if it can deliver what it promises, it means that you only need to build Rust code once and bundle the WASM. This should work on all CPUs and Python interpreters.

The problem, I believe, is that the WASM loader for Python will need to be compiled for each combination of CPU, OS and Python interpreter. That’s far from perfect, but at least it’s easier to get one small common library to support everything and let other libraries and code build on top of it. This could relieve some maintenance burden from other libraries by diverting that work onto the WASM loader maintainers.

Another possible problem is that WASM will have a hard time doing anything that isn’t strictly CPU computation; for example, managing sockets or files, or otherwise communicating with the OS. As WASM was designed to run inside a browser, I expect all OS communication to go through a common API, and that will have some caveats for sure. While I expect the tasks mentioned above to be usable from WASM, things like OpenGL and talking directly to a GPU will surely lack support for a long time.

What Rust Lacks

While most people think that Rust needs to be easier to code in, that it is a complex language that requires a lot of human hours to get code working, let me heavily disagree.

Rust is one of the most pleasant languages to code in once you have expertise in it. It is quite productive, almost at the level of Python, and very readable.

The problem is gaining that expertise: it takes far too much effort for newcomers, especially when they are already seasoned in dynamically typed languages.

An easier way to get started in Rust

And I know this has been said a lot by novices, and it has been discussed ad infinitum: we need a RustScript language.

For the sake of simplicity, I’m calling this hypothetical language RustScript. To my knowledge, the name is not in use and RustScript does not exist, even if I write as though it does.

I have read others proposing this, so please keep reading: I’m more or less aware of what has already been proposed and of some of those discussions.

The main problem with learning Rust is the borrow-checking rules; (almost) everyone knows that. A RustScript language must have a garbage collector built in.

But the other problem, which is not talked about as much, is the complexity of properly reading and understanding Rust code. Because people come in, try a few things, and the compiler keeps complaining everywhere, they never get to learn the basics that would let them read code easily. These people will struggle even to remember whether the type was f32, float or numeric.

A RustScript language must serve as a bootstrap into Rust’s syntax and features while keeping the hard, puzzling stuff away. That way, once someone can use RustScript easily, they will be able to learn proper Rust with a smaller learning curve, already feeling familiar with it and knowing what the code should look like.

So it should change this learning curve:

Into something like this:

Here’s the problem: Rust takes months of learning to become minimally productive in. Without properly knowing a lot of complex stuff, you can’t really do much with it, and that turns into frustration.

Some companies already require 6 months of training for a new hire to become productive. Do we really expect them to add another 6 months on top?

What’s good about Python is that newcomers are productive from day zero. Rust doesn’t need to match this, but the current situation is far too bad, and it’s hurting its adoption.

A lot of programming languages and changes have been proposed, or even built, but they fail to solve this problem completely.

This hypothetical language must:

  • Include a Garbage Collector (GC) or any other solution that avoids requiring a borrow checker.
    Why? Removing this complexity is the main reason for RustScript to exist.
  • Have almost the same syntax as Rust, at least for the features they have in common.
    Why? If newcomers don’t learn the same syntax, they aren’t making any progress towards learning Rust.
  • Be binary- and linker-compatible with Rust; all libraries and tooling must work inside RustScript.
    Why? Having a completely different set of libraries would be a headache and would require a completely different ecosystem. Newcomers should familiarize themselves with Rust libraries, not RustScript-specific ones.
  • Rust sample code must be machine-translatable into RustScript, like how Python 2 can be translated into Python 3 using the 2to3 tool. (Some things like macro declarations might not work, as they might not have a replacement in RustScript.)
    Why? Documentation is key. Having a way to automatically translate your documentation into RustScript will make everyone’s life easier. I don’t want the API-guessing game that happens in PyQt.
  • Be officially supported by the Rust team itself, and bundled with Rust when installing via RustUp.
    Why? People will install Rust via RustUp. Ideally, RustScript should be part of it, allowing easy integration between both languages.

Almost any of these requirements alone is going to be hard to achieve. Getting a language that does everything needed, with all the support… it’s not something I expect to happen, ever.

I mean, Python has it easier. What I would ask of Python is far more achievable than what I’m asking for here, and yet in 10 years there have only been slight changes in the right direction. With that in mind, I don’t expect Rust to ever have a proper RustScript, but if it happens, well, I would love to see it.

It would be even better if RustScript were almost a superset of Rust, making most Rust programs valid RustScript, with few exceptions such as macro creation. This would let developers incrementally move to Rust as they see fit and face the borrow checker in small, easy-to-digest doses. Even so, having to declare a whole file or module as RustScript would still work, as it would let devs migrate file by file or module by module. That’s still better than having to choose between language X or Y for an entire project.

Anyway, I’d better stop talking about this, as it’s not gonna happen, and it would require a full post (or several) anyways to describe such a language.

Proper REPL

Python’s REPL is really good, and a lot of tools make use of it. Rust REPLs exist, but they’re not officially supported and are far from perfect.

A REPL is useful when doing ML and when trying out small things. The fact that Rust needs to compile everything makes this rather useless: it needs boilerplate to work, and every instruction takes time to build interactively.

If Rust had a scripting language this would be simpler, as a REPL for a scripting language tends to be straightforward.

Simpler integration with C++ libraries

The fact that both Rust and Python integrate only with C and not C++ would make anyone think they are on the same level here; but no. Because Python’s OOP is quite similar to C++’s and its magic can make up for the missing parts (method overloading), in the end Python has much better C++ integration than Rust.

There are a lot of ongoing efforts to make C++ integration easier in Rust, but I’m not sure they will ever arrive at something straightforward to use. There’s a lot of pressure on this, and I expect it to get much, much better in the coming years.

Still, the fact that Rust has strict rules on borrowing and C++ doesn’t, and that C++ exceptions really don’t mix with anything else in Rust, will make this hard to get right.

Maybe the solution is a C++ compiler written in Rust, made part of the Cargo suite, so that the sources can be copied inside the project and the library built for Rust, entirely using Rust. This might allow some extra insight and automation that makes things easier, but C++ is quite a beast nowadays, and a compiler that supports the newest standards is a lot of work. This solution would also conflict with Linux distributions, as the same C++ library would need to be shipped twice in different versions: a standard one and a Rust-compatible one.

Lack of binary libraries and dynamic linking

All Rust dependencies currently rely on downloading and building the sources for each project. Because there are so many dependencies, building a project takes a long time. And distributing our build means shipping a big binary that contains everything inside. Linux distributions don’t like this.

Having pre-built libraries for common targets would be nice; or, if not a full build, maybe some halfway artifact that has the most complex parts done and only requires the final optimization stages for the specific CPU, similar to WASM, *.pyc files or the JVM. This would reduce build times by a huge amount and make development more pleasant.

Dynamic linking is another commonly overlooked point. I believe it can be done in Rust, but it’s not something the regular books explain; it’s complex and tricky, whereas the static approach is quite straightforward. This means that any update to any of your libraries requires a full build and a full release of all your components.

If an automated way to do this existed in Cargo, even if it built the libraries in a format that can’t be shared across different applications, it would already bring benefits over what we have. For example, the linking stage could take less time, as most of the time seems to be spent gluing everything together. Another possible benefit is that, since it would produce N files instead of 1 (say 10), an application with auto-update could selectively update only the files needed instead of re-downloading one full fat binary.

To get this to work across different applications, as Linux distributions do, the Rust compiler needs better standards and compatibility between builds, so that if a library is built with rustc 1.50.0 and the application against 1.49.0, they still work together. I believe this currently doesn’t work well, and there are no guarantees of binary compatibility across versions. (I might be wrong.)

On devices where disk space and memory are constrained, such as microcontrollers or small computers, sharing dynamic libraries across applications might help a lot to fit the different projects on the device. For current desktop computers and phones, this isn’t a big deal.

The other reason Linux distributions want these pieces separated is that when a library gets a security patch, usually all it takes is replacing the library on the filesystem and you’re safe. With Rust applications you depend on the maintainer of each project to rebuild and release updated versions. A security patch for an OS, instead of being, say, 10 MiB, could then be 2 GiB because of the number of projects that bundle the same library.

No officially supported libraries aside from std

In a past article, Someone stop NodeJS package madness, please!!, I talked about how bad the JavaScript ecosystem is. Because everyone publishes packages and there’s no control, there’s a lot of cross-dependency hell.

This can happen to Rust as it has the same system. The difference is that Rust comes with “std”, which contains a lot of common tooling that prevents this from getting completely out of hand.

Python also has the same thing with PyPI, but it turns out that the Python standard library covers a lot more functionality than “std”, so PyPI is quite a bit saner than other repositories.

Rust has its reasons for a thin std library, and it’s probably for the best. But something has to be done about the common functionality it doesn’t cover.

There are lots of possible solutions. For example, a second standard library bundling all the remaining common stuff (call it “extra_std” or whatever); everyone building libraries would then tend to depend on that one instead of a myriad of different dependencies.

Another option is to promote specific libraries as “semi-official”, to point people to use these over other options if possible.

The main problem with everyone uploading and cross-depending on each other is that these libraries might have just one maintainer, and that maintainer might move on and forget about them forever; then you have a lot of programs and libraries depending on something that has long been obsolete without knowing it. Forking the library doesn’t solve the problem, because no one has access to the original repo to say “deprecated, please use X”.

Another problem is the security implications of doing this. You depend on a project that might have been audited in the past, or never, but the new version is surely not audited. What state is the code in? Is it sound, or does it abuse unsafe to worrying levels? We would need to inspect it ourselves, and we all know most of us never will.

So if I were to fix this, I would say that a Rust committee with security expertise should select and promote which libraries are “common” and “sane enough”, then fork them under a slightly different name, do an audit, and only ever upload audited code. Having a group looking after those forked libraries means that if a library is ever deprecated, they will correctly update its status and point people to the right replacement. If someone forks a library and that fork becomes preferred, the security fork should migrate to follow it, so everyone depending on it is smoothly migrated.

In this way, “serde” would have a fork called something like “serde-audited” or “rust-audit-group/serde”. Yes, it would always be a few versions behind, but it would be safer to depend on than upstream.

No introspection tooling in std

Python is heavy on introspection, and it’s super nice for automating things. Even Go has some introspection capabilities for its interfaces. Rust, on the other hand, has to rely on macros, and the sad part is that there aren’t any officially supported macros that make this more or less work. Even contributed packages are quite ugly to use.

Something that tends to be quite common in Python is iterating through the fields of an object/struct: their names and their values.
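
For reference, this is the kind of thing that is trivial in Python and that I miss in Rust:

    from dataclasses import dataclass, fields

    @dataclass
    class User:
        name: str
        age: int

    u = User("Ana", 30)

    # Field names and values, without the class cooperating in any special way.
    for f in fields(u):
        print(f.name, "=", getattr(u, f.name))

    # Works on plain objects too, via the instance __dict__.
    print(vars(u))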

I would like to see a derive macro in std that adds methods able to list the names of the different fields, standardizing this for things like Serde. Because if using Serde is overkill for some program, you currently have to cook up these macros yourself.

The other problem is the lack of standard variadic types. Iterating through the values of each field becomes toilsome and inconvenient, because you need to know in advance which types you might receive and how, adding boilerplate to support all of this.

The traits also lack some supertraits to easily classify variable types. If you want a generic function that works with any integer, you need to figure out all the traits you need, when in reality I would just like to say that type T is “int-like”.

Personal hate against f32 and f64 traits

This might be just me, but every time I add a float in Rust it makes my life hard. The fact that floats don’t support proper ordering and proper equality makes them unusable in lots of collection types (HashMaps, etc.).

Yes, I know that these types don’t handle equality (due to imprecision) and comparing them is also tricky (due to NaN and friends). But, c’mon… can’t we have a “simple float”?

In some cases, like configs, decimal numbers are convenient. I wouldn’t mind using a slower type for those cases, one that more or less handles equality (with a built-in epsilon) and handles comparison (with a strict ordering for NaN and Inf, or by disallowing them altogether).

This is something that causes pain to me every time I use floats.

Why I think Rust will not replace Python

Bear in mind that I’m still learning Rust; I might have missed things or be wrong about some of the points above. One year of practising on my own is not enough context for all of this, so take this article with a pinch of salt.

Rust is far too different from Python. I really would like Rust to replace my use of Python, but seeing that there are some irreconcilable differences, I believe this will never happen.

WASM might be able to bridge some gaps, and Diesel and other ORMs might make Rust a better replacement for Python for REST APIs in the future.

In general terms, I don’t see a lot of people migrating from Python to Rust. The learning curve is too steep, and for most of those use cases Go might be enough, so people will skip Rust altogether. And this is sad, because Rust has a lot of potential on many fronts; it just requires more attention than it gets.

I’m sad and angry because this isn’t the article I wanted to write. I would like to say that Rust will replace Python at some point, but if I’m realistic, that’s not going to happen. Ever.

References

https://blog.logrocket.com/rust-vs-python-why-rust-could-replace-python/

https://www.reddit.com/r/functionalprogramming/comments/kwgiof/why_do_you_think_data_scientists_prefer_python_to/glzce8e/?utm_source=share&utm_medium=web2x&context=3

Actix-web is dead (about unsafe Rust)

Update 2020-01-20: the official Actix web repository is back and the maintainer has stepped down. Actix will continue to be maintained.

Recently the maintainer of the Actix web server took down the GitHub repository and moved the code to his personal account, deleting lots of issues and enraging a lot of people. He left a post-mortem:

https://github.com/actix/actix-web/blob/7f39beecc3efb1bfdd6a79ffef166c09bf982fb0/README.md

What happened? I did my own reading of the post-mortem, and via Reddit I also found this article, which summarizes the situation pretty well:

https://words.steveklabnik.com/a-sad-day-for-rust

To summarize it in a few words, in case you don’t feel like reading those: the Rust community is heavily focused on safe use of Rust, where proper memory handling can be proven. Unless you use the “unsafe” keyword, the compiler guarantees the absence of memory errors in a provable way, so for the small parts where the compiler is unable to prove the code correct, it’s okay to use “unsafe”; that remaining code should be small and easy to verify by hand.

Third parties auditing most of the Rust libraries found on the internet discovered that Actix was abusing unsafe. When the unsafe code was reviewed, it turned out that, on misuse, it could lead to serious vulnerabilities. So they opened a bunch of issues and submitted a lot of patches and PRs on GitHub.

The maintainer’s response was that he didn’t care; he accepted almost none of the patches and deleted the issues. The conversation heated up a lot, and finally he deleted the repository from the official organization and left it under his own username.

This is sad. Actix was known for its amazing speed in different benchmarks and was used by a lot of people. While it’s bad that the community is sometimes too harsh and some people lack politeness (which makes a maintainer’s life really hard), I’m going to be controversial here and say: it’s good that this happened and that Actix-web got deleted.

I had been using Actix-web, seduced by its speed, and I never thought I could be promoting a vulnerable web server. I assumed that because the library was written in Rust, the author was taking care to avoid unsafe where possible. I was so wrong. Luckily I had other things to do and never released the article in which I was going to promote Actix-web. Now I’ll have to redo the benchmarks before releasing anything.

The same happened to lots of other people, and with all those uses combined, Actix-web has increased the attack surface of a lot of deployments.

In other cases I would have argued that software prioritizing speed over security is fine in scenarios where the inputs or the environment are not exposed to the internet. But this is a web server; its main job is to be a facade to the internet. And the project documentation never even mentioned that the goal was simply to build the fastest web server, even if that meant sacrificing security.

There’s no point in running Actix-web behind anything to mitigate its potential problems: it is several times faster than raw Nginx or Apache serving static content, so adding anything in front will slow it down a lot. There’s also no reason to use it on internal networks: if it’s just serving HTTP to internal users, any web server will do, as internal networks have much less traffic; and if it’s used to pipe commands between machines, HTTP is just a bad choice, so use RPCs such as gRPC instead.

To be completely fair, let me state that as far as I know Actix-web never had a real incident; it’s just that its correctness cannot be proven. Is this a problem? For me, yes, because if I didn’t care about that I would go with C or C++ instead; there are lots of really good, really fast web servers written in raw C++. The point of using Rust in the first place is having memory guarantees, like using Java but without paying the performance penalty.

I understand that the maintainer just wanted to have fun coding, and there’s nothing wrong with that. But when your product starts getting recommended by others, you have to care. This can’t be avoided: with great power comes great responsibility.

This is the good thing about the Rust community: they’re committed enough to inspect the sources of every single library out there and help reduce the amount of unsafe code, even sending patches.

It’s sad that the repository was “deleted”, but this is good for Rust. Quality needs to be there, and unsafe code definitely needs to be prevented from gaining ground. There’s no point in having Rust if most of the libraries you can practically use are memory-unsafe.

To conclude: please be polite, everyone. It’s quite hard when you have a ton of people bashing you. But also, keep up the good work!

ORM should be used in every project

When it comes to performing database queries in an application, I hear everything from “I stick with SQL-92 so I’m not stuck with one database” to “Everyone should drop ORMs because they don’t use the database’s full potential and are slower”. I believe both positions are wrong. Every project should use an ORM: change my mind.

Sticking with SQL-92, a 25-year-old standard (even older; do the math), is not going to make your application run against any database. Things like character escaping, string quoting and name quoting have different requirements in different databases. Also, no database implements the SQL standard fully and properly; some products are stricter than others, but even then there are always corners where the standard has to be bent a bit to fit the database design. If you go this route, you’re going to get stuck with one database anyway, and your application is going to be slower because you’re refusing to use any powerful features. To me, this sounds like someone who doesn’t know SQL well and doesn’t want to learn.

Dropping the ORM entirely and going only with raw SQL, using every feature your chosen database has, will give you the most performance you can get, but you’ll end up with a lot of very specific SQL everywhere that is hard to translate to other products. That part is fine by me; what worries me is that your code will depend heavily on the specific query performed, so you’ll end up with a lot of code duplication.

I have seen this lots of times. In one part of the program we need to display users, so there’s a query and a template. In another part there are users by post, so there’s another query and another piece of glue code. And again and again, more glue code appears everywhere to do the same thing from different sources. Sometimes we even needed to emulate records that hadn’t been created in the database yet, so there’s even more glue code.

Proper approach to database querying

The best solution is to go with a fast ORM that allows the use of raw SQL when needed, either in parts of a query or for the whole thing. If the performance of the default ORM implementation is good enough, go with it. Whenever it falls short, try adding raw snippets just in the clauses where you need them. If the ORM is still too slow for some case, go for a fully raw query just for those cases.
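
As a sketch of that escalation path in Django (the app, model, columns and SQL here are all hypothetical):

    from django.db.models.expressions import RawSQL
    from myapp.models import User  # hypothetical app and model

    # 1. Plain ORM: good enough most of the time.
    active = User.objects.filter(is_active=True)

    # 2. A raw snippet inside an ORM query, only where the ORM falls short.
    ranked = User.objects.annotate(
        score=RawSQL("ln(post_count + 1) * %s", (2.5,))  # hypothetical column
    ).order_by("-score")

    # 3. A fully raw query for the few cases where the ORM is just too slow.
    heavy = User.objects.raw(
        "SELECT id, name FROM myapp_user WHERE last_login > now() - interval '7 days'"
    )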

As I said before, I’m not especially worried about getting stuck with one database product. When I design the application I carefully choose the best possible option, and usually I don’t need to change it later. The problem comes with unit testing.

Setting up a full-blown database like MySQL or PostgreSQL is hard and eats lots of resources. SQLite, on the other hand, offers really good SQL support and can be created in memory, so it’s really convenient.

When using an ORM, most of the logic can be tested against SQLite. I even found that my raw queries with CTEs are understood by SQLite, which is amazing. The remaining ones, because they’re just a few, can either get another raw query just for SQLite or have the call mocked entirely. We could even have a separate folder of integration tests that run against a full PostgreSQL server.
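
In Django, for instance, pointing a test settings module at an in-memory SQLite database is nearly a one-liner (a sketch; the settings layout is an assumption):

    # settings_test.py (hypothetical module): run the unit tests on in-memory SQLite.
    from .settings import *  # noqa: F401,F403  (reuse the normal settings)

    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": ":memory:",  # created fresh for the test run, no server needed
        }
    }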

But this is not the main benefit of an ORM. And no one should expect to be able to use an ORM without knowing SQL properly.

The main benefit of an ORM is “Don’t Repeat Yourself” (DRY): avoiding writing duplicated logic for each context.

ORMs are really powerful in this regard: they create objects for us, and when we write functions for those objects we don’t need to know which data was actually queried, or under which column names. Functions can be reused everywhere to perform the same logic across the application. This, for me, is the main motivation for using an ORM.

Sometimes we can query partial data from a table, and the receiving function will never notice. Sometimes we can prefetch related data, and the function will use it just fine without knowing it.
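
A tiny illustration of that DRY effect with a hypothetical Django model:

    from django.db import models

    class Order(models.Model):
        quantity = models.IntegerField()
        unit_price = models.DecimalField(max_digits=10, decimal_places=2)

        def total(self):
            # The same logic works whether the instance came from a full query,
            # a partial query that still loaded these columns, or a prefetched
            # relation; callers don't need to know.
            return self.quantity * self.unit_price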

I believe there is a proportional relation between code and bugs: the more code you have, the more chances of bugs in it. Keeping your code as concise as possible makes the intention clearer and is simpler to reason about, inspect, and prove correct.

In other cases, a bigger SQL query is simpler to prove correct, because SQL is closer to a declarative or functional language and expresses things as mathematical relations, which are also a good tool for proving correctness. But if, once the results land in your code, you still have to do complex calculations on them, then either you’re doing something wrong or the problem is genuinely hard to solve quickly.

Some ORMs can map raw SQL to objects, and we should use this to avoid duplicating logic again and again.

The main problem with ORMs is performance. Everywhere I see developers saying that ORM performance doesn’t matter, that we should move the work into the database anyway. This is true and false at the same time. On one hand, yes, we should move all possible calculations into the database unless there is a legitimate reason not to. But on the other, ORM performance does matter.

Guess what happens when we consume 10,000 rows from a complex database query in most ORMs: the database handles the load, with filters and joins, neatly and quickly, responding in 10 ms or less. The ORM can then add another 10 ms on top in some cases, making the response twice as slow.

This is the main reason so many people rail against ORMs: the different libraries fail to realize how important it is to be fast, and they become the main bottleneck. They fail to see that databases are not just fast, they’re blazing fast: they can handle millions of records in seconds. Any slowdown in our application will hit us heavily.

A clear indicator of how much performance is overlooked is the lack of compiled Python ORMs (using C for the parts where speed is required). I also compared Django (Python) against GORM (Go), and while GORM issues queries slightly faster, Django seems to retrieve slightly more rows per second. How is this even possible if Python is 10x slower? Because most or all ORMs in Go use reflection (like introspection in Python) instead of generating specific code for the types, so Go ends up doing much more work than it should.

Common Missing Features in ORM

ORMs keep adding features to mimic more and more of SQL, but as stated before, performance is missing. No matter which ORM, I always find the following features missing or incomplete: query plan caching, result caching, proper cache eviction tools, and the ability to inject generated/mock results.

Query Plan Caching

There’s a cost associated with converting the ORM representation of a query into the database’s SQL dialect. When a query is complex enough and executed often, this becomes a small bottleneck.

But most of the query is usually stated already in code, and in fact, at compile-time it could be partially generated for a particular dialect speeding up later calls.

Some databases allow as well prepared statements. On those, you just prepare the calls at the beginning and afterwards you only need to provide the required data, without nay effort. This not only saves you from building a SQL each time, it also saves time to the database as it does not have to parse and plan the query each time.
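With PostgreSQL you can do this by hand from Python (a minimal sketch using psycopg2; the connection string and table are made up):

import psycopg2

conn = psycopg2.connect("dbname=test")
cur = conn.cursor()

# Parse and plan the query once...
cur.execute(
    "PREPARE product_by_id AS "
    "SELECT id, name, price FROM product WHERE id = $1"
)

# ...and afterwards only send the parameters.
for product_id in (1, 2, 3):
    cur.execute("EXECUTE product_by_id (%s)", (product_id,))
    print(cur.fetchone())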

Result caching

Caching results not only saves the cost of building the SQL or unmarshalling the records, it also saves a whole database query with its round-trip. Most of the time I have had to do this manually, with little help from the ORM, which is exhausting.
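This is the kind of helper I end up writing by hand (a sketch, not part of any ORM, reusing the hypothetical Product model from above):

import time

_cache = {}  # key -> (expires_at, rows)

def cached_query(key, run_query, ttl=60):
    # Return cached rows if still fresh; otherwise run the query and cache it.
    entry = _cache.get(key)
    if entry and entry[0] > time.time():
        return entry[1]
    rows = run_query()
    _cache[key] = (time.time() + ttl, rows)
    return rows

# The lambda only hits the database on a cache miss.
products = cached_query("products:all", lambda: list(Product.objects.all()))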

Cache eviction tooling

So we cached the data: good. But now it’s stale: not so good. The ORM actually knows, or has ways to know, whether the application itself invalidated the data. Because it knows which tables are touched by each query or update, it could easily implement a cache eviction mechanism à la MySQL, where whenever a table is modified, all caches using that table are evicted. It’s really simple and it removes a lot of useless queries.

For external modifications outside the app there are also tricks, like leveraging columns like last_modified, reading internal statistics of the database, or having an external communication channel that allows us to tell the app to evict cache for a certain table.
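The core idea is simple enough to sketch outside any ORM (assuming we know which tables each cached query touches; the hard part is getting the ORM to report that and to call the eviction hook on every write):

from collections import defaultdict

_cache = {}
_keys_by_table = defaultdict(set)  # table name -> cache keys built from it

def cache_set(key, value, tables):
    _cache[key] = value
    for table in tables:
        _keys_by_table[table].add(key)

def evict_table(table):
    # Call this on every write to the table, à la the MySQL query cache.
    for key in _keys_by_table.pop(table, ()):
        _cache.pop(key, None)

cache_set("products:all", ["..."], tables=["product"])
evict_table("product")  # any write to "product" drops every entry built from it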

How many ORMs implement this idea? Zero. Good luck implementing it yourself without touching the ORM.

Inject generated results

This pattern is a bit more convoluted. Imagine we have a function that works with data from a query. Now we want to use it without querying. In most cases replacing the query with a list of created objects is good enough.
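For the simple case, that looks like this in Django (a sketch reusing the hypothetical Product model; the objects are built in memory and never hit the database):

fake_products = [
    Product(id=1, name="demo", price=10),
    Product(id=2, name="expensive demo", price=500),
]
# The helper from before works the same on unsaved, in-memory objects.
expensive_products(fake_products)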

But what if the function expects working relationships? For example in Django, with select_related or prefetch_related. Those are nearly impossible to mock up properly.

And now the final boss: perform a query against the database and fill some many-to-many relationships without querying, from a cache or from generated data. This makes more sense than it might seem. The typical use case is: we want to output a list of products, and with it a huge bunch of extra data from relationships which is very costly to fetch and process. On the website we want the products to be fresh, but for the extra data we really don’t care that much, as it rarely changes and it’s barely noticeable. So we want to join a fresh query with a potentially stale cache. If the previous example was nearly impossible, imagine how “easy” this one is.

ORMs are needed but a pain point

Some people might think that I’m obsessed with performance, but I believe that those who have seen me working would agree that I only focus on performance when it actually matters. I do not care unless I see potential problems.

We should be using an ORM to avoid code duplication and to keep logic concentrated in a single place. But by doing that we lose a lot of performance. ORMs also lack the tooling to avoid calculating things twice, which makes it even worse.

Still, I would recommend that everyone use one, because it’s better than raw queries in so many ways. ORMs can provide correctness, slimmer code and fewer chances of bugs. I just hope that someday they’ll have features to make caching easier.

At the moment I’m trying to benchmark different ORMs across Python, Go and Rust, and I hope to get it out in a few weeks.

Future of Rust and Go

Having used both languages for a bit, I am starting to see where they fit and which markets they will take over. Where will we see Rust and Go in the next 10 years? Will they replace any language?

Rust

Being still quite new and difficult to start with, Rust will take its time to find its place, but I have no doubt that it will.

Rust directly competes with C++, and it’s better in every way, having proper protections against memory misuse. C++ on the other hand has a head start of 27 years, so the amount of existing C++ code and libraries will be a huge thing to overcome. But since Rust is easier to work with, I believe it will take over at some point.

For C, depending on the program, Rust could be seen as a replacement. If the application is big and complex, Rust can add abstraction without extra cost while staying fast. There is also Rust for embedded systems. Rust has a lot of potential in this area, but it looks to me like it will be much harder for it to gain ground against C.

For Java, it might or might not make sense. The Rust compiler is really picky, which can be seen as an advantage for projects that require some sanity; but Java has more ways of protecting against code misuse than Rust. In cases where Java was chosen for speed, Rust might be compelling, especially because Rust’s memory footprint is really small. But nowadays Java is chosen for the availability of developers on the market, and that is a big problem for Rust.

Against scripting languages like JavaScript, Python, PHP or Perl, Rust does not make sense, because it requires too much coding effort compared to them.

Scientific-oriented languages like F# also won’t see any benefit from Rust, as it is too technical for their current audiences.

Functional languages like Haskell might see some use of Rust, as it supports much of the functional approach while being blazing fast. Also, functional languages are usually picky about type coercion and similar things, so Rust won’t be that surprising to them. But generally, Rust is too low-level for those audiences.

Products that we might see using Rust

  • Browsers: Don’t forget Rust was created by Mozilla for Firefox to speed up development while guarding against memory errors.
  • Virtual machines: Rust is an ideal candidate as it can control interrupts and it’s even prepared for embedded systems.
  • Databases: They need blazing speed, small footprints, abstraction and thread safety. Rust brings all of those.
  • Game development: Speed is crucial here and Rust will enable modding in a fast, safe way.
  • Kernels and drivers: Did I say that Rust supports embedded? Also it should be able to create kernels. There is some traction on allowing Rust drivers in Linux already.
  • WebAssembly: Rust is pioneering Wasm support, and being the fastest language, it makes sense to use it there. Not surprising, as Mozilla is interested in this. Will we see web frameworks like Angular written in Rust someday?
  • Basic OS tools like GNU: Those are currently written in C or C++, and Rust is easier to hack with. Sadly, this is an area that does not really require any improvement, so I doubt anyone will try to rewrite them. But a Bash implemented in Rust could be interesting.

As stated before, Rust has a steep learning curve, so don’t expect to see any of these in the mid term. We will have to wait a long time before seeing Rust’s real potential.

Go

The biggest advantage of Go is its simplicity to read and write code. Go is taking over already and we will see a lot of applications based on it.

Its main competitors are scripting languages like Python, PHP, Perl and so on. Offering more or less the same productivity while adding nice type checking allows projects to grow in an organized way. Go is really opinionated, and this helps prevent a mess of code when the team is big and people come and go.

Applications using C++ and C will not see much benefit from Go, as it is a bit slower, and compared to C++ it does not offer the same level of abstraction. Only those using C without squeezing out all its performance, and those using C++ with a really basic set of features, will see a benefit in moving to Go.

Java developers will probably not want to change to Go. It offers roughly the same speed with less abstraction and fewer checks. Not having proper OOP and exceptions will produce too much friction for a move. The only benefits are the smaller memory footprint and avoiding deploying the JVM on the final system, which makes sense mainly for containers and serverless.

For scientific and functional languages Go doesn’t offer much, so I don’t expect anyone adopting Go from these.

Products that we might see using Go

  • OS tooling: Currently Bash, Perl and Python cover the majority of the scripts used in Linux for the boot process as well as other tasks like maintenance. Go would make these easier to grow and faster, delivering some performance gains.
  • Docker: This one already uses Go!
  • REST API: Go seems the ideal language, as there are speed gains and ease of development.
  • Backend frameworks and CMS: I don’t see why not. Go can generate HTML templates as any other language.
  • Applications for containers and serverless: The ease of deploying Go, its threading support and its speed are ideal for these environments.
  • WebAssembly: While still experimental and behind Rust, if it improves, Go’s ease of use will make it very appealing for writing Wasm.

The biggest issue I see with Go for scripting and the web is that we’re used to messing with the code as it runs. If operating systems start deploying binaries for these, we might lose some freedom to hack on the code, as you have to download the source separately and rebuild it.

As Go builds cheaply and the result can be cached, someone might come up with solutions to run it like we do with scripts, like the answers in this StackOverflow question.

As for me, I will keep learning both Go and Rust. I will surely give Go some use in the near term, but with Rust I’ll have to keep learning for a while before putting it to actual use.

What about you? Do you plan to give Rust or Go a try?

Benchmarking Python vs PyPy vs Go vs Rust

Since I learned Go I started wondering how well it performs compared to Python in an HTTP REST service. There are lots and lots of benchmarks already out there, but the main problem with them is that they’re too synthetic: mostly a single simple query, far from real-world scenarios.

Some frameworks like Japronto exploit this by making the connection and the plain response blazing fast, but of course, as soon as you have to do some calculation (and you have to; otherwise what’s the point of having a server?) they fall apart pretty easily.

To set a baseline: Python is about 50 times slower than C++ on most benchmarks, while Go is 2-3 times slower than C++ on those, and Rust sometimes even beats C++.

But those benchmarks are purely CPU- and memory-bound for particular problems. Also, the people who submitted the code used a lot of tricks and optimizations that will not happen in the code we usually write, because safety and readability are more important.

Another common type of benchmark is the HTTP framework benchmarks. In those we can get a feel for which languages outperform others, but it’s hard to measure. For example, in JSON serialization Rust and C++ dominate the leaderboard, with Go only 4.4% slower and Python 10.6% slower.

In the multiple-queries benchmark, we can see that the tricks used by the frameworks to “appear fast” are no longer useful. Rust is on top here, C++ is 41% slower, and Go is 43.7% slower. Python is 66.6% slower. Some filtering can be done to put all of them under the same conditions.

In that last test, which looks more realistic, it is interesting to see that Python is 80% slower, which means 5x slower than Rust. That’s really far better than the 50x of the pure CPU benchmarks I pointed out first. Go, on the other hand, does not have any entry including an ORM, so it’s difficult to compare its speed.

The question I’m trying to answer here is: Should we drop Python for back-end HTTP REST servers? Is Go or Rust a solid alternative?

The reasoning is that a REST API usually does not contain complicated logic or big programs; it just replies to more or less simple queries with some logic. Such a program can be written in virtually anything. With the container trend, deploying built binaries is even more appealing, as we no longer need to compile for the target machine in most cases.

Benchmark Setup

I want to try a crafted example of something slightly more complicated, but I haven’t found the time to build a proper one yet. For now I have to fall back into the category of “too synthetic benchmarks” and release my findings up to this point.

The idea is to implement the fastest possible version of each of the following tests:

  • HTTP “Welcome!\n” test: Just the raw minimum to get the actual overhead of parsing and creating HTTP messages.
  • Parse Message Pack: Grab 1000 pre-encoded strings and decode them into an array of dicts or structs. Return just the number of strings decoded. This aims to measure the speed of a library decoding cached data previously serialized into Redis.
  • Encode JSON: Having cached the previous step, now encode everything as a single JSON. Return the number of characters in the final string. Most REST interfaces have to output JSON, so I wanted to get a grasp of how fast this is compared to the other steps.
  • Transfer Data: Having cached the previous step, now send this data over HTTP (133622 bytes). Sometimes our REST API has to send big chunks over the wire, and it contributes to the total time spent.
  • One million loop load: A simple loop over one million iterations doing two simple math operations with an IF condition, returning just a number. Interpreted languages like Python can take a huge hit here; if our REST endpoint has to do some work, as ORMs do, it will be affected by this (see the sketch after this list).
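As a rough idea of the kind of code under test, this is roughly what the one-million-loop endpoint looks like in Python (a sketch using Werkzeug; the real benchmark code may differ in the details):

from werkzeug.wrappers import Request, Response

@Request.application
def application(request):
    total = 0
    # One million iterations: two simple math operations and an IF condition.
    for i in range(1_000_000):
        if i % 2 == 0:
            total += i
        else:
            total -= 1
    return Response(str(total))

if __name__ == "__main__":
    from werkzeug.serving import run_simple
    run_simple("0.0.0.0", 8080, application)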

The data being parsed and encoded looks like this:

{"id":0,"name":"My name","description":"Some words on here so it looks full","type":"U","count":33,"created_at":1569882498.9117897}

The tests were performed on my old i7-920 capped at 2.53GHz. It’s not really rigorous, because I had to keep some applications open while testing, so assume a margin of error of around 10%. The programs were written with the minimal effort possible in each language, selecting the libraries that seemed the fastest according to several published benchmarks.

Python and PyPy were run under uwsgi, sometimes behind NGINX, sometimes with the HTTP server included in uwsgi; whichever was faster for the test. (If anyone knows how to test them with less overhead, let me know)

The measures have been taken with wrk:

$ ./wrk -c 256 -d 15s -t 3 http://localhost:8080/transfer-data

For Python and PyPy the number of connections had to be lowered to 64 in order to perform the tests without error.

For Go and Rust, the web server built into the executables was used directly, without NGINX or similar. FastCGI was considered, but it seems to be slower than raw HTTP.

Python and PyPy were using Werkzeug directly with no url routing. I used the built-in json library and msgpack from pip. For PyPy msgpack turned out to be awfully slow so I switched to msgpack_pypy.

Go was using “github.com/buaazp/fasthttprouter” and “github.com/valyala/fasthttp” for serving HTTP with url routing. For JSON I used “encoding/json” and for MessagePack I used “github.com/tinylib/msgp/msgp”.

For Rust I went with “actix-web” for the HTTP server with url routing, “serde_json” for JSON and “rmp-serde” for MessagePack.

Benchmark Results

As expected, Rust won; but surprisingly not in every test, and in some with not much of a margin. Because of the big differences in the numbers, the only way to make them properly readable is with a logarithmic scale, so be careful when reading the following graph: each major tick means double the performance:

Here are the actual results in table format: (req/s)


        HTTP        parse msp   encode json   transfer data   1Mill load
Rust    128747.61   5485.43     5637.20       19551.83        1509.84
Go      116672.12   4257.06     3144.31       22738.92        852.26
PyPy    26507.69    1088.88     864.48        5502.14         791.68
Python  21095.92    1313.93     788.76        7041.16         20.94

Also, for the Transfer Data test, it can be translated into MiB/s:


        transfer speed
Rust    2,491.53 MiB/s
Go      2,897.66 MiB/s
PyPy    701.15 MiB/s
Python  897.27 MiB/s

And, for the sake of completeness, requests/s can be translated into mean microseconds per request (1,000,000 divided by req/s; for example, Rust’s 128,747.61 req/s on the HTTP test becomes roughly 7.77 µs per request):


        HTTP    transfer data   parse msp   encode json   1Mill load
Rust    7.77    51.15           182.30      177.39        662.32
Go      8.57    43.98           234.90      318.03        1,173.35
PyPy    37.72   181.75          918.37      1,156.76      1,263.14
Python  47.40   142.02          761.08      1,267.81      47,755.49

As for memory footprint (while encoding JSON):

  • Rust: 41MB
  • Go: 132MB
  • PyPy: 85MB * 8proc = 680MB
  • Python: 20MB * 8proc = 160MB

Some tests impose more load than others. In fact, the HTTP-only test is very hard to measure, as any slight change in the setup produces a completely different result.

The most interesting result here is Python under the tight loop; for those who have expertise in this language it shouldn’t be surprising. Pure Python code is around 50x slower than compiled-language performance.

PyPy, on the other hand, managed to get really close to Go in the same test, which proves that the PyPy JIT compiler can actually detect certain operations and optimize them to near-C speeds.

As for the libraries, we can see that PyPy and Python perform roughly the same, with a much smaller gap to their Go counterparts. The difference is caused by the fact that Python objects have a certain cost to read and write, and Python cannot optimize the types in advance. In Go and Rust I “cheated” a bit by using raw structs instead of dynamically created objects, so they got a huge advantage by knowing in advance the shape of the data they would receive. This implies that if they receive a JSON with less data than expected they will crash, while Python will be just fine.

Transferring data is quite fast in Python, and given that most APIs will not return huge amounts of it, this is not a concern. Strangely, Go outperformed Rust here by a slight margin. It seems that Actix does an extra copy of the data and a check to ensure UTF-8 validity. A lower-level HTTP server would probably be slightly faster. Anyway, even the slowest at 700 MiB/s should be fine for any API.

On the HTTP connection test, even though Rust is really fast, Python only takes about 50 microseconds per request. For any REST API this should be more than enough, and I don’t think it contributes much to the total.

On average, I would say that Rust is 2x faster than Go, and Go is 4x faster than PyPy. Python is from 4x to 50x slower than Go depending on the task at hand.

What matters more for a REST API is the library selection, followed by raw CPU performance. To get better results I will try to do another benchmark with an ORM, because those add a fair amount of CPU cycles into the equation.

A word on Rust

Before going all in on developing everything in Rust because it is the fastest, be warned: it’s not that easy. Of the four languages tested here, Rust was by far the most complex, and it took me, untrained, several hours to get it working at the proper speed.

I had to fight for a while with lifetimes and borrowed values; I was lucky to have the Go test for the same thing, so I could clearly see that something was wrong. If I hadn’t had that reference I would have finished earlier and called it a day, leaving code that copies data far more times than needed and is slower than a regular Go program.

Rust has more opportunities and information to optimize than C++, so its binaries can be faster, and it’s even prepared to run in crazier environments like embedded, malloc-less systems. But there is a price to pay.

It requires several weeks of training to get some proficiency in it. You also need to properly benchmark the different parts to make sure the compiler is optimizing as you expect. And there is almost no one on the market with Rust knowledge, so hiring people for Rust might cost a lot.

Also, build times are slow, and in these tests I always had to compile with “--release”; otherwise the timings were horribly bad, sometimes slower than Python itself. Release builds are even slower to compile. It has a nice incremental build that cuts this time down a lot, but changing just one file still requires about 15 seconds of build time.

Its speed is not far enough ahead of Go to justify all this complexity, so I don’t think it’s a good idea for REST. If someone is targeting near one million requests per second, cutting the CPU usage in half might make sense economically; but that’s about it.

Update on Rust (January 18, 2020): This benchmark used actix-web as the web server, and there has been a huge roast recently about its use of “unsafe” Rust. I had more benchmarks prepared with this web server, but now I’ll redo them with another one. Don’t use actix.

About PyPy

I have been pleased to see that the PyPy JIT works so well for pure Python, but it’s not an easy migration from regular Python.

I spent way more time than I wanted on making PyPy work properly with Python 3 code under uWSGI. I also ran into the problem of MessagePack being slow on it. Not all Python libraries perform well on PyPy, and some do not work at all.

PyPy also has a high load time, followed by a warm-up: the code needs to run a few times for PyPy to detect the parts that require optimization.

I am also worried that complex Python code cannot be optimized at all. The loop that was optimized was really straightforward. Under a complex library like SQLAlchemy the benefit could be slim.

If you have a big codebase in Python and you’re willing to spend several hours giving PyPy a try, it could be a good improvement.

But if you’re thinking of starting a new project on PyPy for performance, I would suggest looking into a different language.

Conclusion: Go with Go

I managed to craft the Go tests in no time with almost no experience in Go; I learned it several weeks ago and had only written one other program with it. It takes a few hours to learn, so even if a particular team does not know it, it’s fairly easy to get them trained.

Go is an easy language to develop with and really productive. Not as much as Python, but it gets close. Also, its quick build times and the fact that it builds statically make it very easy to iterate code-test-code, and attractive for deployments as well.

With Go you could even deploy source code if you want, and have the server rebuild it each time it changes, if that makes your life easier or uses less bandwidth thanks to tools like rsync or git that only transfer changes.

What’s the point of using faster languages? Servers, virtual private servers, serverless or whatever technology you use incurs a yearly cost of operation. And this cost will scale linearly (in the best-case scenario) with user visits. Using a programming language, frameworks and libraries that use as few cycles and as little memory as possible keeps this yearly cost low and allows your site to accept far more visits at the same price.

Go with Go. It’s simple and fast.

HTTP Pipelining is useless

…and please stop publishing benchmarks with Pipelining enabled. It’s just lying about real-world performance.

Today I just found out that one of my favorite sources for HTTP framework benchmarks is indeed using pipelining to score the different programming languages and frameworks and I’m mad about it:

https://www.techempower.com/benchmarks/

The first time I saw this was with Japronto, which claimed one freaking million requests per second, and of course this wasn’t reproducible unless you used a specific benchmarking method with pipelining enabled.

Before HTTP/2 I was in favor of pipelining because we were so limited on parallel requests and TCP connections were so costly that it made sense. Now, with H2 supported on all major browsers and servers, pipelining should be banned from benchmarks.

What is HTTP pipelining?

In classic HTTP/1, we had to open a TCP connection for every single request. Open the socket, send the request, wait for the response and close the socket. TCP connections are expensive to open, so this was a real problem back in the day.

With HTTP/1.1 we got keep-alive, where after a request completes we can send another request over the same TCP socket. This alleviated the problem. But still, if your computer is far from the server (it usually is), the server sits idle waiting for the last packet sent to reach your computer, and then for your next request to come back. For most servers this is around 80ms of delay from one request to the next.

Here is where HTTP pipelining enters the scene: we could send another request before the response was received, effectively queuing the requests on the server and receiving the responses in order. Wikipedia has a nice diagram of this:

This looks nice, but HTTP/1.1 never got pipelining working in practice; it was in the spec, not mandatory, with some clients and servers supporting it. But since most web servers at the time failed to reply properly with pipelining, and there was no reliable way for the client to tell whether the server actually supported it, no major browser ever enabled support. What a shame!

It was a really good idea, but then HTTP/2 came with multiplexing and this problem vanished. There are still challenges in this area, but nothing that Pipelining will solve. So now, we’re happy with multiplexing.

HTTP/2 does not have Pipelining

This is a common misunderstanding. Yes, you can send several requests; even several thousands without waiting to receive anything. This is really good, but it’s not pipelining. Why?

Pipelining, as its name implies, acts like a pipe: first in, first out. The requests are queued in the server in order and replied to in order.

HTTP/2 instead has multiplexing, which seems similar but is better. Multiplexing means that you get several streams inside one connection at the same time, so you can receive data as it is produced. The requests are not queued and are not returned in the same order; they come back interleaved, at the same time.

Why pipelining gives so good results

Because it’s equivalent to copying a file over the network, especially in synthetic benchmarks where localhost is the target, pipelining removes a lot of the effort of handling the different packets.

Instead of grabbing a packet for a request and processing it, you can let it buffer, then grab a big chunk in one go that might contain hundreds of requests, and reply back without caring at all whether the client is getting the data or not.

Even more, as the benchmark is synthetic, servers more or less know beforehand what to serve, reducing the time spent looking at what is requested and simply replying with approximately the same data again and again.

The benchmark clients also do way less effort, because they only need to fill a connection with the same string repeated millions of times.

If you think about it carefully, this is even faster than copying files over localhost: You don’t even need to read a file in the first place.

HTTP/2 multiplexing is slower

Compared to pipelining, of course. Because you’re not serving a clear stream of data but thousands of interleaved streams, your server has to do more work. This is obvious.

Of course, we could craft a cleartext HTTP/2 server that does multiplexing over effectively one single stream, replying in order. This would result in performance closer to pipelining, because it is in fact pipelining.

But implementing this on a production site would be naive, for the same reasons that applied when HTTP/1.1 pipelining was a thing. Proper HTTP/2 multiplexing is far superior in real-world scenarios.

And my question is, do you want your benchmark to return higher results or do you want your users to have the best experience possible?

Because if you only care about benchmarks, maybe it’s just easier to change the benchmark so it returns better results for your server, right?

Pipelining will not help serve more requests

I can hear some of you saying “If we enable pipelining in production, we will be able to serve millions of requests!”. And… surprise!

Why, you might ask? Well, depending on the scenario the problem is different, but it always comes down to the same two issues: you need to be able to reply out of order to avoid bottlenecks, and a single user will never produce thousands of pipelined requests the way your benchmark tool does.

First pitfall: Requests are heterogeneous, not homogeneous.

Requests will not have the same size, nor the same reply size. They will have different compute times or wait times before replying. Does your production site reply with a fortune cookie to every single request? Even CSS and JPEG requests? No, I don’t think so.

Why does this matter? Well, say your client asks for a CSS file and a JPEG for the same page and you reply with pipelining. If the JPEG was requested first, the CSS will stall until the image has completed, preventing the page from rendering for some time.

Now imagine we have a REST API and we get thousands of requests from a client. One of the requests contains an expensive search on the database. While that one is being processed, the channel will sit idle and your client will be frozen.

Second pitfall: Real users will never pipeline thousands of requests.

Unless your site is really badly designed, you’ll see that more than 50 parallel requests do not make much sense. I tried HTTP/2 myself with an Angular site aggressively sending requests for tiny resources, and the results were quite good, but still fewer than 100 requests in parallel. And the approach was pretty dumb. Aside from this, popular servers and browsers lack support for HTTP/1.1 pipelining, so enabling it in your product will not make any difference.

Let’s consider this for a second. Why do we want to pipeline in the first place? Because the client is far from the server and we want to reduce the impact of the round-trip time. So, say our ping time to the server is 100ms (which is higher than usual), and we pipeline 100 requests at a time.

Effectively, in one round-trip we served 100 requests, so this equates to 1ms of RTT per HTTP response (100ms / 100 requests). What has a 1ms RTT? A local network! So when you reach this level of parallelism, the client works as fast as it would from your local network, given the same bandwidth. Try the same math for one thousand and ten thousand pipelined requests: 0.1ms and 0.01ms respectively.

So now the question is: are you trying to save 0.9ms per request for the client, or are you just trying to make your benchmark numbers look better?

Scenario 1: API behind reverse proxy

Assume we have our shiny Japronto on port 8001 on localhost, but we want to serve it alongside the rest of the site on port 80. So we put it behind a reverse proxy; this might be Apache, Nginx or Varnish.

Here’s the problem: none of the popular web servers or reverse proxies support pipelining. In fact, even serving static data they will be slower than what your shiny pipelining framework claims it can do.

Even if they did, when they proxy the request they don’t pipeline to the proxied server either.

This approach renders pipelining useless.

Scenario 2: Main Web Server

So let’s put our framework directly facing the public internet, on another port, who cares? We can send the requests from Angular/React/Vue to whatever port and the user will not notice. Of course this adds a bit of complexity, as we need to add some headers here and there to tell the browser to trust our application running on a different port than the main page.

Nice! Does this work? Well, yes and no.

The main concern here is that we’re exposing a not-so-well-tested server to the internet, and this can be incredibly harmful. Bugs are most probably sitting there unnoticed, until someone does notice and exploits them to gain access to our data.

If you seriously want to do that, please put it inside a Docker container with permissions cut down and most mount points read-only, including the initial container image itself.

Did we enable HTTP/2 with encryption? If we’re lucky enough that our framework supports it, then it will consume extra CPU doing the encryption and multiplexing.

HTTP/2 over clear text does not work in any browser, so if you try, most users will just go with HTTP/1.1.

If we don’t use HTTP/2 at all, 99% of users have browsers that do not use pipelining at all.

For the cases where they do, the routers and middleboxes that make the internet itself work will sometimes mess up the data, because they see cleartext HTTP and want to “manage” it because “they know the standard”. And they’re pretty old.

Scenario 3: Pipelining reverse proxy

I had an excellent idea: let’s have our main web server collect all requests from different users and pipeline them over a single stream! Then we can open several processes or threads to make further use of the CPU, and with pipelining the number of requests per second served will be astonishing!

Sounds great, and a patch to Nginx might do the trick. In practice this is going to be horrible. As before, we will have bottlenecks, but now one user can freeze every other user just by asking for a bunch of costly operations.

Conclusion 1

The only way this can work is if the framework supports encrypted HTTP/2 and is fast at it. In that case, you should be benchmarking frameworks with HTTP/2 multiplexing.

If your framework does not multiplex properly and instead effectively pipelines the data, then users will see unexplainable delays under certain loads that are hard to reproduce.

Conclusion 2

In some scenarios the client is not a user’s browser, for example RPC calls in a microservices architecture. In this case pipelining does work, provided the responses are homogeneous.

But it turns out that HTTP is not the best protocol for those applications. There are tons of RPC protocols and not all of them use HTTP. In fact, if you look at the fast ones, you’ll see that HTTP is the first thing they drop.

In the past I wrote an RPC protocol myself called bjsonrpc. I wanted speed, and dropping HTTP was my main motivation for creating it.

If you need HTTP for compatibility, just have two ports open, one for each protocol. Clients that can’t understand a specific protocol are likely to not understand pipelining either. Having a port for each thing will give you the best performance in the clients that support it while still allowing other software to connect.

Brief word on QUIC

The new-old QUIC protocol by Google is currently being standardized by the IETF as the base for the future HTTP/3 protocol. QUIC supports fast encryption setup (fewer round trips), has a lot of tolerance for packet loss, and supports IP address changes.

This is indeed the best protocol possible for RPC calls, except for its massive use of CPU compared to raw TCP. I really hope that someone standardizes a non-HTTP protocol on top of it aimed at application connections, to be supported by browsers.

The takeaway: managing the protocol takes a lot of CPU, we have to do it in production, and skipping part of that work for some frameworks that support it is unfair to the others. Please be considerate and disable pipelining when publishing benchmarks; otherwise a lot of people will make the wrong decision based on YOUR results.

Sedice – Adding FTS with PostgreSQL was really easy

Last weekend I added an internal search to my forum at Sedice.com using PostgreSQL Full Text Search, and it was much easier than I ever thought. But let me tell you the full story, in short.

This project was started in 2005 as a PHP Nuke site running on MySQL. As you can guess, by now everything on that site is really, really outdated. I never had the time or the people to replace it properly, but that’s a longer story for another post. It grew pretty quickly and its built-in search became really slow; we kept upgrading to bigger servers and it was never enough. So we disabled it and replaced it with a Custom Google Search. Until now.

In the last few months the users of the site started complaining that they could no longer find old posts using that search. The reason seems to be that the site is not widely used anymore, and Google no longer thinks it’s worth indexing the whole site.

To fix this I can’t simply re-enable the old search, as it would be devastating for the server. Also, over the last few years I downgraded the server to a really cheap one: less than 10 euros per month.

The only way out is to build a proper FTS service, and I thought it could take me weeks, as I’m no longer used to it and I had only built such a thing twice in the past, with PostgreSQL.

I had been really busy with other stuff and didn’t want to start on this only to leave it unfinished. I have some time now, so I decided to give it a try.

First things first, I started researching which database to use. Reusing the same database (MySQL) would be handy, but MySQL is known for not having good FTS support. So it wasn’t an option.

I searched the web for the best options for FTS, but all I got was ElasticSearch and Lucene. No mentions of other alternatives. It’s kind of sad, because ElasticSearch is built on top of Lucene, so in practice we have only one recommended product.

A bit more research revealed that they are based on Java, so they weren’t an option for me either. Java is going to take four times more CPU (at least) and several times more memory than a database written in C. I don’t have any JVM installed on my tiny server, and I don’t particularly like the idea of installing one just for this.

So I went back to my dear PostgreSQL. Maybe it’s not the best for FTS, but it has good support, it’s fast, and it is very friendly with memory usage.

My next question was whether the data and indexes were going to fit on the server, and for that I needed to export a sample and perform some tests.

I started by spinning up PostgreSQL 11 on my home computer and installing the MySQL foreign data wrapper (FDW). If FDW is new to you, let me tell you it is really neat: it is a way to connect PostgreSQL directly to other databases and perform cross-database queries. Of course these are not as performant as having everything in one database, but they’re really good for data import/export and for some fine-tuned queries as well.

https://github.com/EnterpriseDB/mysql_fdw

By just using the samples on the GitHub main page I was able to set up foreign tables for topics, posts and posts_text (Nuke has a 1-1 relationship to separate text from posts). As querying those is slow, I created materialized views for them, with indexes.

Materialized views are a mixture between a view and a table. They are stored like tables, but the data they contain comes from a view definition and can be refreshed at will:

CREATE MATERIALIZED VIEW mv_nuke_bbposts AS SELECT * FROM ft_nuke_bbposts;

This worked really nicely and I could start playing around a bit, but it became clear that for posts_text I was missing around 30% of the data. It seems that the MySQL FDW has a limit (or a bug) on how much data can be transferred, so once it hits roughly 500MB it stops consuming data. So for that one I resorted to a rudimentary approach:

CREATE TABLE tmp_nuke_bbposts_text AS 
  SELECT * FROM ft_nuke_bbposts_text WHERE id < 300000;
INSERT INTO tmp_nuke_bbposts_text 
 SELECT * FROM ft_nuke_bbposts_text 
  WHERE id BETWEEN 300000 AND 600000;
INSERT INTO tmp_nuke_bbposts_text 
 SELECT * FROM ft_nuke_bbposts_text 
  WHERE id BETWEEN 600001 AND 900000;
INSERT INTO tmp_nuke_bbposts_text 
 SELECT * FROM ft_nuke_bbposts_text 
  WHERE id BETWEEN 900001 AND 1200000;
INSERT INTO tmp_nuke_bbposts_text 
 SELECT * FROM ft_nuke_bbposts_text 
  WHERE id BETWEEN 1200001 AND 1500000;

This worked, and I could play around with it for a while. The original text table is 1GB in MySQL but only 440MB in PostgreSQL. So far so good.

The next step was reading the PostgreSQL FTS docs and applying the recipes there:

https://www.postgresql.org/docs/11/textsearch-controls.html

I decided to create a GIN index over the text column parsed into tsvector:

CREATE INDEX ON tmp_nuke_bbposts_text 
    USING GIN (to_tsvector('spanish', post_text));

This created a 180 MegaByte index and I could search it really quickly:

SELECT post_id FROM tmp_nuke_bbposts_text
 WHERE to_tsvector('spanish', post_text) @@ plainto_tsquery('search term')

This took between 0.2ms and 6ms depending on the number of hits, which is really awesome for searching 1GB of text across 1.5 million entries. The problem is that we only get “hits” from this, in no particular order.

For a search to be most useful we need ranking. PostgreSQL has ts_rank and ts_rank_cd for this purpose. The first one looks at the number of hits in the document, while the other also accounts for the position of the words: how close they are to each other compared to the original query string.

When I added this, performance dropped significantly, to 600ms for a single query. This is because PostgreSQL has to convert the text to a tsvector on the fly. While it could in principle use the value stored in the index, that might not be possible with a GIN index. So it becomes critical to store the converted tsvector in its own column.
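The change itself is small; something along these lines (a sketch run from Python with psycopg2; the connection string is made up, and the table matches the one used above):

import psycopg2

conn = psycopg2.connect("dbname=sedice")
cur = conn.cursor()

# Store the parsed tsvector once instead of recomputing it on every query...
cur.execute("ALTER TABLE tmp_nuke_bbposts_text ADD COLUMN vector tsvector")
cur.execute(
    "UPDATE tmp_nuke_bbposts_text "
    "SET vector = to_tsvector('spanish', post_text)"
)

# ...and index the stored column; ts_rank can now read the column directly.
cur.execute("CREATE INDEX ON tmp_nuke_bbposts_text USING GIN (vector)")
conn.commit()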

After creating the new column plus joining the select with posts and topics to get an idea of the finished product, the search took between 6ms and 15ms, which is really good.

I also considered using GiST indexes instead of GIN, but after a quick read of the docs it seems that GIN is faster for reads and slower for updates, while GiST has faster updates but slower reads. While in some scenarios faster updates are desirable, in our case we’re building something read-heavy and we don’t care that much about update cost. It wasn’t bad at all anyway, as creating the index for the full dataset usually took less than a minute using a single thread.

Still, I wasn’t sure this approach was the best one, so I tried another design before settling on anything. How would it perform if I indexed by thread instead of by post text?

As I already had the tsvector built for the post_text table, I concatenated them together grouping by thread_id and created a new table. PostgreSQL can concatenate tsvector values, but it does not have aggregate functions for tsvector, so I couldn’t group them easily without writing my own functions. I was lazy, and this is just a test, so instead I converted each tsvector into an array, unnested it into rows, then grouped and aggregated back into an array, and finally converted back to a tsvector. This loses the position information, but it also compacts duplicate words into a single entry:

CREATE TABLE fts_topics AS SELECT x.topic_id,  array_to_tsvector(array_agg(x.term)) as vector FROM
    (
        SELECT topic_id,  unnest(tsvector_to_array(vector)) as term  
        FROM fts_posts
        WHERE topic_id is not null
    ) x
WHERE x.term ~ '^[a-z]{3,30}$'
group by x.topic_id

Array support in PostgreSQL is really neat and saved me lots of time. I also removed some useless terms in the process; that is something I was planning to do later anyway, by exporting the posts from Python with proper parsing. For now, this is more than good enough to test the idea.

This table turned out to be only 22MB, with a 56MB index. I could get results from it in 0.2ms, which was impressive, but the ranking functions weren’t useful anymore: as I wasn’t storing positions or repetitions, every result ranked the same. I also tested joining these results with the topics data and found that the timings were really similar to the ones from indexing posts_text. And the problem is, if I’m not gaining any time from this, I still need to find the actual messages, which will take more space and time, so this design does not seem to be a good idea. So I went with indexing the post text directly.

Experimenting a bit more, I noticed that I was still missing one big, important piece for a proper search. I need a snippet of the post in each result to show and highlight where the text was found, so the user can see at a glance whether that is what they were searching for. PostgreSQL has a neat function for this called ts_headline, which also outputs <b> tags in an HTML-like fashion, very convenient for use on any website.

So I wrote a huge SQL query that, for a particular search string, outputs bbcode for PHP Nuke. The reason is that I wanted to show users what this could look like in the final product by posting it as a forum message, without actually building anything. Here is the monster:

SELECT array_to_string(array_agg(t2), '                       
') FROM (
SELECT '- - - [b]#' || (row_number() over ()) || ' (' || rank || '%) ' || '[url=https://www.sedice.com/modules.php?name=Forums&file=viewtopic&p=' || k.id || '#' || k.id || ']' || CASE WHEN LENGTH(k.subject) > 2 THEN k.subject ELSE t.topic_title END || '[/url][/b]
' || REPLACE(REPLACE(REPLACE(k.head, '<', '['), '>', ']'),'
', ' ') as t2
FROM (
SELECT (ts_rank_cd(vector,
  plainto_tsquery('spanish', t.text1))*100)::numeric(8,2) as rank,
  ts_headline('spanish', text_out, plainto_tsquery('spanish', t.text1), 'MaxFragments=1,MaxWords=100,MinWords=50') as head, *
    FROM fts_docs,
    (SELECT 'cazador de sueños' as text1) t
    WHERE vector @@ plainto_tsquery('spanish', t.text1)
ORDER BY rank DESC
LIMIT 5) k
INNER JOIN mv_nuke_bbtopics t ON t.topic_id = k.topic) j

Really ugly, but it got the job done. I could also see for myself how it would look in the finished product.

Now it was clear that this is really possible and should be easy to deploy. The last challenge was to index the data properly on the server. For this I used my backend code for sedice.xyz (a Flask+Angular experiment for replacing sedice.com) to write a command that reads posts one by one and dumps them into PostgreSQL after processing. Having SQLAlchemy already there helped a lot with this task.

The post-processing was slower than I would like, around 1000 rows per second even after some optimization. I can’t do much more, because for each post I have to conditionally process bbcode, HTML and emoticons, convert it to HTML, and then parse that HTML to retrieve only the text part. In a language other than Python this could be faster, but it is a one-time task; afterwards it will only run for updates from time to time, so it’s fine. I ended up adding a trick: parsing only a fraction of the posts by filtering on “post_id % 8 == N”, so locally I could start eight processes in parallel and use all my resources, and I didn’t have to wait that long.

I decided I would have only a single table in PostgreSQL with almost all the data in it. The only remaining matter was the thread subject; I finally decided not to add it, as it is difficult to track thread edits while it is easy to track post edits. This incurs one MySQL database call per result, which is not ideal, but it should stay within my performance margins.

This table ended up having 810MB of data and 360MB of index. I redid the query for the new table, and after checking that performance was still within my expected values, I moved the code to the server to start indexing there. On the server I didn’t want any mess, so I disabled the parallel load and just went out for beers while the load took place. At 1000 rows/s it should be done in 30-50 minutes, but I never checked.

When I got back it was almost done. All I needed was a web page that connects to both databases and performs the query, that’s it. While I was tempted to add this to my Angular site, I was discouraged by the fact that I had forgotten to add proper support for post lookup. It has some, but it redirects to the last post of the thread instead, and it still lacks paging. For the time being, pointing to the old sedice.com was the only reliable option.

So I went and created a damn simple PHP page. I used bulma.io as a starting point, just copying the template over and using the CDN. A bit of markup and logic, and voilà, done:

https://www.sedice.com/sebusca/?search-text=mi+vecino+totoro

It has been a great weekend for coding and learning new stuff. PostgreSQL makes things easy and I’m always grateful to the project.

A few notes if you want to try this:

  • to_tsvector defaults to the database language if you don’t pass a dictionary. While this is neat, it makes the function non-“IMMUTABLE”, so it can’t be used in an index expression. Use “to_tsvector('spanish', col1)”, as the two-argument form is “IMMUTABLE”.
  • PostgreSQL knows the concepts of stop words (which are meaningless for the search because they’re too common), lexemes (root parts of a word that convey the same meaning), synonyms and thesauruses. They come preconfigured for each language and they’re really powerful without further work, but they can be configured further, and this will improve search speed and result quality.
  • PostgreSQL FTS does not have fuzzy search built in. It uses lexemes and they do help, but if a user mistypes one letter it can return zero results. You can get around this by building a similarity search with pg_trgm. It’s not hard to do, just more work. I skipped this step in this project, but I have used it other times.
  • You can also assign one of four weights, from A to D, to different texts. This is useful to give titles more weight than the content. It wasn’t useful in this particular case.
  • It does lack other cool features like attributes; you can “emulate” these by pushing manually crafted lexemes that can’t occur on their own, like “author:deavid”. It needs some manual effort. Some of these needs can also be covered with regular column searches.

Overall, working with PostgreSQL Full Text Search is a great experience with really good results. If your project already uses PostgreSQL for its data, consider using its FTS support before adding an external database. Also, if you are short on hardware like me, PostgreSQL is a great choice.

More secure web servers with Docker (V)

Having already seen how to install the whole LAMP server, FTP included, in this article that closes the series we are going to focus on security and on doing things right to avoid possible disasters.

Restricting container permissions

The containers we have created are a considerable security improvement, but they can still all connect to each other, and they can also access the internet.

We will mainly cover:

  • Read-only mount points
  • Mixing read-only and writable mount points to define which part of the website can be written to
  • Making the container (the image) read-only
  • Defining a restricted network topology so that only specific containers can communicate with each other, and so they have no internet access.

Read-only volumes

Read-only mount points (volumes) are quite simple to implement: you just add “:ro” at the end of the mount point string to indicate that it is read-only.

Which mount points can be read-only? Practically all of them. Let’s go over all the ones we have:

  • Nginx:
    • conf.d: it does not need to write its own configuration
    • /var/www: it does not need to write to the websites
    • /cache: writable. It needs to write new cache entries and delete old ones.
  • PHP (Apache and FPM):
    • /var/www/html: a website should not “overwrite” itself
  • MySQL:
    • /var/lib/mysql: writable. It needs to modify the databases.
  • PostgreSQL:
    • pgdata: writable. It needs to modify the databases.
  • PhpMyAdmin:
    • /sessions: writable. It needs to be able to save sessions.
  • Python uWSGI:
    • foobar.py: it must not be able to rewrite itself.
  • SFTP and FTP:
    • /www: writable. Unless the FTP is only meant for reading the content, of course.

Now, for every mount point that is not writable we add “:ro” at the end, for example:

master-webserver:
image: nginx:1.14
volumes:
- ./master-webserver/sites-enabled:/etc/nginx/conf.d:ro
- ./www:/var/www:ro
- ./master-webserver/cache:/cache

You can change the rest yourselves. Then restart the services with “docker compose up -d” and check whether everything still works.

Now, in the case of websites it is very common to allow uploading files such as images and attachments. If everything is read-only, the container will not be able to write, and therefore users will not be able to attach anything.

How do we solve this? By mounting the folder in question (for example uploads/) on top of the read-only one, but in write mode.

For example, let’s say we want our PHP program to write a file inside an “uploads/” folder. We temporarily remove the “:ro” from php-apache-1 and change the code in index.php:

<html>
 <head>
  <title>Prueba de PHP</title>
 </head>
 <body>
 <?php
$filename = './uploads/test.txt';
echo '<p>Hola Mundo</p>';
file_put_contents($filename, 'Time: ' . time() . "\n", FILE_APPEND);
echo '<pre>';
echo file_get_contents($filename);
echo '</pre>';

 ?>
 </body>
</html>

We have to create the uploads/ folder, and it must be owned by the www-data user, whose UID has to be 33. (What matters is the UID.)

As you can see, it works, and there is already an important per-user permission restriction. This is something to keep in mind: done properly, this tedious business of UIDs and permissions will prevent damage if everything else fails.

But it is not enough; we don’t trust it and we want the site mounted read-only. We simply add another line with the specific folder, without the “:ro” flag:

php-apache-1:
image: php:7.3-apache-stretch
volumes:
- ./www/php-apache-1:/var/www/html/:ro
- ./www/php-apache-1/uploads:/var/www/html/uploads/

And it works again. Now it is crystal clear that nothing can be written outside the uploads folder. Even if someone obtains root permissions inside the container, or employee or customer X unknowingly runs a “chmod 0777 . -R”, the folders not enabled for writing cannot be written to. This is what I call security.

But what happens if the website is a Drupal or a WordPress and we need to update it or install plugins? Do we have to do that by hand now? What if the customer wants to install something themselves? Don’t worry, I explain a possible solution further down in this same article.

Read-only containers

An attacker who manages to get into a container will most likely not obtain root access, and their user-level access will greatly limit what they can and cannot do. But even so, they could modify the container.

If we believe the container has been tampered with, recreating it should be enough. Likewise, volumes can be deleted and recreated, unless they contain valuable information such as website code or databases.

My recommendation is to rely on Git to store the website code, excluding uploads and temporary files. For uploads and databases, make frequent incremental and full backups, and keep a history of copies going back at least 6 months. That way, if there were an unauthorized modification in any of these parts, we would have data to try to undo the mess.

As for the containers themselves, making them read-only is an extreme measure that adds barely 1% of security compared to other, simpler practices. But if the attack surface we have left is 2%, reducing it by another 1% means doubling the security.

Making a container read-only is very simple. You just need to add “read_only: true” to the service description. For example:

php-apache-1:
image: php:7.3-apache-stretch
read_only: true
volumes:
- ./www/php-apache-1:/var/www/html/:ro
- ./www/php-apache-1/uploads:/var/www/html/uploads/

The problem is that in practice this is not so simple. This is what happens when the container starts:

Apache tries to write its PID file on startup and fails. As a result, the container stops. Programs usually assume they are running on an operating system with write access, so when they cannot write they end up “blowing up” in various ways.

Each Docker image has its own quirks when mounted read_only. The best approach is to read the documentation of the particular image we want, to see whether it covers how to run it read_only. In the case of the PHP one, there is no information.

We can go by trial and error: we look in the logs to see which file fails to be written and fix it. The fix can be a tmpfs volume. These volumes live in memory and are lost on every restart; ideal, so that even if someone manages to install something there, it won’t last long.

We will add a tmpfs for /var/run/apache2; but keep in mind that /var/run is a symbolic link to /run, so it ends up looking like this:

php-apache-1:
image: php:7.3-apache-stretch
read_only: true
volumes:
- ./www/php-apache-1:/var/www/html/:ro
- ./www/php-apache-1/uploads:/var/www/html/uploads/
tmpfs:
- /run/apache2/

We try this and see that it works again. It seems it was not as complicated as it appeared.

Restricting network access

By default, all containers have network access to the internet, to the host, and to any other container. This is great for development environments, but in production, having the PHP container able to reach port 22 of another SFTP container is far from ideal.

Granted, as always, if the passwords are secure, not reused and so on, an attacker who gains access to one container should not be able to gain access to another, simply because they don’t know the users and passwords.

But besides the fact that blocking connections is a very good security layer, the containers have internet access, and after gaining access to a container an attacker could send SPAM, try to attack other machines on the LAN, or infect machines over the internet. Apart from the business impact this may have, an ISP or another entity could shut down the servers if it sees malicious activity. So we have no shortage of reasons to restrict the containers’ network connections as much as possible.

There are two ways to restrict the networks. One is with a Dockerfile: by customizing the image, you can make the container run a series of “iptables” commands on startup to limit network connections. The advantage is that you can be very precise. The disadvantage is that we would have to customize every image. If you are interested, this article explains it in a simple way:

https://dev.to/andre/docker-restricting-in–and-outbound-network-traffic-67p

The second way is to use the networks that docker-compose can create for us. That is what we will do here. The advantage is the simplicity of the configuration. The downside is that we can only define subnets between containers.

In our case, since there are no clearly separated domains (in the end everything talks to everything: nginx to php, php to mysql, ...), the best option is to define one subnet per container. It sounds a bit excessive, but it lets us control which machine connects to which using “docker-compose.yml”.

So we start by adding a “networks:” key at the end of the file. It has to be at the top level (root), outside the services. In it, we add one entry per container, plus one called “default”, like this:

networks:
  default: {internal: true}
  master-webserver: {}
  php-apache-1: {internal: true}
  php-fpm-1: {internal: true}
  mysql-1: {internal: true}
  postgresql-1: {internal: true}
  phpmyadmin: {internal: true}
  pgadmin4: {internal: true}
  python-uwsgi-1: {internal: true}
  sftp-php-apache-1: {}
  sftp-python-uwsgi-1: {}
  ftpd-php-apache-1: {}

The “default” network is the one used by the containers for which we do not specify any network. “internal: true” means no internet access. This way we disable internet for all the containers that do not need it. As a rule of thumb, if a container has no port routed to the host, it does not need internet.
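Once the stack has been brought up, you can verify this by inspecting the network; assuming the compose project is called docker-lamp, as in the examples further down, something like:

$ docker network inspect docker-lamp_default --format '{{.Internal}}'
true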

Then, in each service, we add a “networks:” key and list the networks it may access, which will be itself and, in some cases, another container. The most important one is master-webserver:

master-webserver:
  image: nginx:1.14
  networks:
    - master-webserver
    - php-apache-1
    - php-fpm-1
    - pgadmin4
    - phpmyadmin
    - python-uwsgi-1
  (...)

But php-apache-1, for example, only needs connectivity with itself:

php-apache-1:
  image: php:7.3-apache-stretch
  networks:
    - php-apache-1

With this, Nginx can connect to php-apache-1 as an HTTP proxy, but php-apache-1 cannot connect to Nginx (well, it actually can if it scans IPs, but it cannot connect to MySQL or to php-fpm-1). Also, Nginx has internet access, while php-apache-1 does not.
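A quick way to check the isolation (assuming Debian-based images, as used here) is to try resolving another service's name from inside a container; Docker's embedded DNS only resolves names of containers that share a network:

$ docker-compose exec php-apache-1 getent hosts mysql-1            # should fail: no shared network
$ docker-compose exec master-webserver getent hosts php-apache-1   # should resolve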

For reference, this is what phpmyadmin would look like:

phpmyadmin:
  image: phpmyadmin/phpmyadmin:4.7
  networks:
    - phpmyadmin
    - mysql-1

Now splitting everything into containers makes even more sense. Thanks to this separation, PHP runs completely isolated from the network. Putting a trusted service such as Nginx in front and hiding the more vulnerable one, PHP, behind it greatly reduces the probability of a successful hack.

About Docker and networks

When you change the networks after having already brought containers up with custom networks, you will often get errors saying the network cannot be reconfigured or cannot be found.

If it cannot reconfigure the network, stop the containers with “docker-compose stop container1”, then list the networks with “docker network ls” and delete the one we changed with “docker network rm networkname”.

After that, when running “docker-compose up”, pass the “--force-recreate” flag so it recreates the containers with the new network. For example:

$ docker-compose stop master-webserver
$ docker network ls
NETWORK ID          NAME                              DRIVER              SCOPE
2ff8f2a1ce0f        docker-lamp_master-webserver      bridge              local
$ docker network rm docker-lamp_master-webserver
$ docker-compose up -d --force-recreate master-webserver

As always, the source code for this tutorial is on GitHub:

https://github.com/deavid/docker-lamp/tree/v0.12

Improving password management

For FTP and SFTP, having the passwords in clear text inside the repository is a serious problem. For example, if someone uses the repository “as is”, they are trivially exposing their machine to code execution: the attacker simply logs in over SFTP with the default password, installs their software and runs it over HTTP. Yes, it is limited to the container; but it is still very dangerous.

Or picture the scenario where a company keeps its private version of this example in Git too, but internally. One day an employee pushes to the wrong remote by mistake and it sits on GitHub for a few weeks. Every password of every customer published in no time.

We need to see what we can do to mitigate these scenarios.

SFTP

The atmoz/sftp image allows passwords to be specified in encrypted form. To do that, we use the “makepasswd” program and append “:e” to the result to indicate that it is encrypted.

For example:

deavid@debian:~$ makepasswd --crypt-md5 --clearfrom=-  
mipassword
mipassword   $1$wRMZ98PL$2z8jFQSQGvTEmfJYigDvW0

Now we edit the configuration and add the encrypted password:

sftp-php-apache-1:
  image: atmoz/sftp:debian-stretch
  networks:
    - sftp-php-apache-1
  volumes:
    - ./www/php-apache-1:/home/admin/php-apache-1
  ports:
    - "2201:22"
  command: admin:$$1$$wRMZ98PL$$2z8jFQSQGvTEmfJYigDvW0:e:1000

If you look closely, you will see that I doubled the dollar signs. That is because docker-compose has a variable substitution syntax, and if it sees a dollar sign without a variable it raises an error. “$$” is how you escape “$”.

With this we are done. It is not that this encrypted password should be shared, but if it does leak, it will be considerably harder for anyone to work out the original password, among other things because the same password produces a different string every time.

The “makepasswd” program can be installed on Debian/Ubuntu with “apt-get install makepasswd”. You can even install it only on the development machines; it is not needed on the server. The passwords work just the same.

FTP

For the FTP image there is no way to encrypt the password, which was quite a convenient feature to have. This image stores its users permanently, much like MySQL and PostgreSQL do. The best approach is to adapt it so it works in a similar way:

ftpd-php-apache-1:
  image: stilliard/pure-ftpd
  networks:
    - ftpd-php-apache-1
  ports:
    - "2101:21"
    - "21010-21019:21010-21019"
  expose:
    - "21010-21019"
  volumes:
    - ./www/php-apache-1:/var/www/html/
  environment:
    - PUBLICHOST=localhost # Replace this with your server public IP
    - FTP_PASSIVE_PORTS=21010:21019
    - FTP_USER_HOME=/var/www/html/
    - FTP_USER_NAME=admin
    - FTP_USER_PASS

This way, the user is not created until we provide the “FTP_USER_PASS” environment variable at startup. Be careful: if we start all the containers at once and there are several FTP containers, passing this variable means they will all receive the same password.

To avoid this, we can map a custom variable name that can differ per container:

environment:
  - PUBLICHOST=localhost # Replace this with your server public IP
  - FTP_PASSIVE_PORTS=21010:21019
  - FTP_USER_HOME=/var/www/html/
  - FTP_USER_NAME=admin
  - FTP_USER_PASS=${FTP_PHP_APACHE_1_PASS-}

The variable in this case is FTP_PHP_APACHE_1_PASS; the dash at the end sets the default value, which is empty. It is mainly there to avoid a Docker Compose warning when the variable does not exist. The FTP container is smart enough not to create users without a password: if we do not pass one, it creates nothing.

You can create a separate file with a series of “export” commands that define all the passwords, so that you can run “source ../passwords.profile”.
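A minimal sketch of such a file (the name and values are made up; keep it out of Git):

# passwords.profile -- never commit this file
export FTP_PHP_APACHE_1_PASS='change-me'

# Then, from the project folder:
#   source ../passwords.profile && docker-compose up -d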

The example code is available here:

https://github.com/deavid/docker-lamp/tree/v0.13

Admin application with HTTP Auth

Let's see how to build a secondary container for the same website with more permissions, one that requires a password to access any page.

The idea is very simple: we duplicate the container, give it internet access and write access to disk, and connect it to Nginx. In Nginx we add HTTP Basic Auth and optionally restrict by IP. This is the guide:

https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/

As an example, we will copy the php-fpm-1 container and name it php-fpm-1-admin. Remember to change the network name and to remove the read-only flag from the volume:

php-fpm-1-admin:
  networks:
    - php-fpm-1-admin
    - mysql-1
  image: php:7.3-fpm-stretch
  volumes:
    - ./www/php-fpm-1:/var/www/html/
  restart: always

In the “networks” section at the end of the file we add our new network, but without “internal: true”, because we want it to be able to reach the internet:

networks:
  default: {internal: true}
  master-webserver: {}
  php-apache-1: {internal: true}
  php-fpm-1: {internal: true}
  mysql-1: {internal: true}
  postgresql-1: {internal: true}
  phpmyadmin: {internal: true}
  pgadmin4: {internal: true}
  python-uwsgi-1: {internal: true}
  sftp-php-apache-1: {}
  sftp-python-uwsgi-1: {}
  ftpd-php-apache-1: {}
  php-fpm-1-admin: {}

With this, the container can now be created correctly. It has internet access but cannot reach the other containers. It can write to this website's folder but not to other websites'. Now we should add the Nginx configuration so we can connect to it; do not expose this container's port.

Copy php-fpm-1.conf to php-fpm-1-admin.conf, change the server name, for example to “admin.php-fpm-1”, and change fastcgi_pass to “php-fpm-1-admin:9000”.
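The exact contents of php-fpm-1.conf were set up earlier in the series; as a rough sketch of what the copy might end up looking like (the listen port, root path and fastcgi parameters are assumptions based on the defaults used so far):

server {
    listen      85;
    server_name admin.php-fpm-1;
    root        /var/www/php-fpm-1;

    location ~ \.php$ {
        include       fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass  php-fpm-1-admin:9000;
    }
}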

We also have to update the master-webserver networks in docker-compose.yml, since it now needs to be able to reach this new machine.

We add the following two lines to the “server” block:

auth_basic           "Admin";
auth_basic_user_file /etc/nginx/conf.d/php-fpm-1.htpasswd;

This tells Nginx that the passwords will be stored in the same configuration folder. Now we need to create that file. For that we use the “htpasswd” program, which comes with “apt-get install apache2-utils”:

$ touch master-webserver/sites-enabled/php-fpm-1.htpasswd
$ htpasswd master-webserver/sites-enabled/php-fpm-1.htpasswd admin
New password:
Re-type new password:
Adding password for user admin

The first command creates the file if it does not exist. The second one adds a user called “admin” and prompts for the password.

As a result, the file now contains the following:

admin:$apr1$V8vOzr1B$1dK0BkFGSJuXTLa89anLs/

As you can see, the password is stored encrypted. The only thing left is to update /etc/hosts to include the new domain admin.php-fpm-1 and try to access it.
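For a local test, the entry is just another alias for localhost (the exact list of names depends on what you added in previous chapters):

127.0.0.1   localhost php-fpm-1 admin.php-fpm-1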

When we access it, the browser asks for the username and the password.

And that is it. After entering the user “admin” and the password we chose, we can access the site. To also restrict by IP, see the article I linked at the beginning of this chapter.

The idea here is to be able to update the website or install plugins from the site itself, without compromising security. By having a secondary site for this, blocked to the public but not to the webmaster, the webmaster can use it to change the application itself, while an attacker cannot reach this version.

This authentication should only be used over HTTPS, since the passwords are transmitted in plain text on every request. We will cover that in the next chapter. You can download this chapter's code from here:

https://github.com/deavid/docker-lamp/tree/v0.14

HTTPS and auto-renewal with Let's Encrypt

These days HTTPS is the norm. Let's see how to set up HTTPS with Let's Encrypt; this only works if you have real domains on a running web server. We will also set up automatic renewal, and all of this in a dedicated container, separate from the rest.

Self-signed certificates

Since Let's Encrypt requires a real domain served from the same machine, I will start with something simpler: self-signed certificates. They are not a good idea for real websites, as browsers will complain that the certificate is not trusted, but they are the closest thing we can get in a development environment.

The basic idea is to create an SSL certificate with the “openssl” command, mount it as a volume in Nginx and configure it appropriately.

Since they are not good for much in the end, I will not bother creating them properly; it is enough to create “something” that works. We create a folder ./master-webserver/certs/ and inside it we create an SSL certificate for php-fpm-1:

openssl req -x509 -nodes -days 3650 -newkey rsa:2048 \
    -keyout php-fpm-1.key -out php-fpm-1.crt
Generating a RSA private key
.................................................................+++++
.......+++++
writing new private key to 'php-fpm-1.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:
State or Province Name (full name) [Some-State]:
Locality Name (eg, city) []:
Organization Name (eg, company) [Internet Widgits Pty Ltd]:
Organizational Unit Name (eg, section) []:
Common Name (e.g. server FQDN or YOUR name) []:php-fpm-1
Email Address []:

The only field in the command that really matters is the Common Name, which should be the domain name. In any case, since the certificate is not going to be fully valid anyway, it does not matter much.

This creates two files in the certs/ folder, a .crt and a .key. The key file is private and you should treat it as carefully as a password. It is not encrypted.

The next step is to open port 443 and configure the (read-only) volume in docker-compose.yml:

services:
  master-webserver:
    image: nginx:1.14
    networks:
      - master-webserver
      - php-apache-1
      - php-fpm-1
      - php-fpm-1-admin
      - pgadmin4
      - phpmyadmin
      - python-uwsgi-1
    volumes:
      - ./master-webserver/sites-enabled:/etc/nginx/conf.d:ro
      - ./master-webserver/certs:/etc/nginx/certs:ro
      - ./www:/var/www:ro
      - ./master-webserver/cache:/cache
    ports:
      - "85:85"
      - "443:443"
    restart: always

Then we go to the domain configuration in sites-enabled/php-fpm-1.conf and add at the top:

listen              443 ssl http2;
ssl_certificate     /etc/nginx/certs/php-fpm-1.crt;
ssl_certificate_key /etc/nginx/certs/php-fpm-1.key;
ssl_protocols       TLSv1 TLSv1.1 TLSv1.2;
ssl_ciphers         HIGH:!aNULL:!MD5;

That is it: when we visit https://php-fpm-1 we will get the warning that the certificate is not valid (it is not), we add a temporary exception and we should be able to see the site.

If port 443 is already taken and you want to use a different one, just change it in the three places.

Let's Encrypt

To begin with, note that Let's Encrypt will not work on a development machine. We need a real domain on a machine that is serving it on port 80 through our master-webserver (or with another reverse proxy mapping port 85 to 80). So this section only works if you have used the tutorial to set up a real web server.

What we will do is create a new service called “letsencrypt”, based on a Docker image we will build ourselves, named “lamp/letsencrypt”, which in turn is based on Debian Stretch. On that base we install certbot and mount both the websites and the certificates folder (read+write). Both need to be writable: certbot writes a test file into the websites to prove that we control the domain, and it has to be able to write the certificates when it creates or renews them.

We add the service and its network to docker-compose.yml:

services:
  (...)
  letsencrypt:
    build: ./letsencrypt
    image: lamp/letsencrypt
    volumes:
      - ./master-webserver/certs:/certs
      - ./www:/var/www
      - ./letsencrypt-etc:/etc/letsencrypt
    networks:
      - letsencrypt

networks:
  default: {internal: true}
  (...)
  letsencrypt: {}

NOTE: The /letsencrypt-etc folder lives at the project root instead of being /letsencrypt/etc because, on startup, the image writes several files there as root without read permissions, and later builds of the image would then fail, since Docker recursively walks the folders next to the Dockerfile.

The letsencrypt network allows external connections, since it has to contact the Let's Encrypt servers to request the certificates.

We create the folder and the file “letsencrypt/Dockerfile” with the following contents:

FROM debian:stretch
RUN echo "deb http://ftp.debian.org/debian stretch-backports main" \
    > /etc/apt/sources.list.d/backports.list
RUN set -ex \
    && apt-get update \
    && apt-get install -y -t stretch-backports \
       certbot \
    && rm -rf /var/lib/apt/lists/*
CMD ["sleep", "99999"]

This initial Dockerfile pulls Debian Stretch, adds the backports repository and installs certbot. After that it simply runs a sleep to keep the container alive without doing anything.

To test Dockerfiles, the easiest way is:

$ docker-compose up -d --build letsencrypt

Once you have it running, we open a bash shell inside:

$ docker-compose exec letsencrypt bash

This gives us root access to the container. Inside it, we check the certbot version:

root@3ff79f0bc160:/# certbot --version 
certbot 0.28.0

Your version may be newer. If it is older, it means the package did not come from backports.

Once here, we can start generating the certificates like this:

certbot certonly --staging --webroot \
    --cert-path /certs --key-path /certs \
    -w /var/www/php-fpm-1/ \
    -d php-fpm-1.example.com -d www.php-fpm-1.com \
    -w /var/www/php-apache-1/ -d php-apache-1.example.com

A few things worth noting here:

  • It will only work if the domain and subdomain exist and we are serving them from our Nginx at that moment.
  • The “--staging” flag is for testing; the certificates it returns are not valid. Use it until you understand how it all works, and remove it once you are confident. Let's Encrypt has rate limits and you must not abuse them; with this flag the limits are much higher.
  • We specify a folder to verify with “-w /var/www/…” and then pass the list of domains that have to be validated against that folder.
  • You can pass more than one folder and more than one domain at a time. The recommendation is to pass as many at once as possible, since Let's Encrypt will try to generate certificates that cover more domains.

This is an example of the output (a failed one, though; I have no domain to test with right now):

# certbot certonly --staging --webroot -w /var/www/php-fpm-1/ -d sedice.com 
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Enter email address (used for urgent renewal and security notices) (Enter 'c' to
cancel): deavidsedice@gmail.com

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Please read the Terms of Service at
https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf. You must
agree in order to register with the ACME server at
https://acme-staging-v02.api.letsencrypt.org/directory
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(A)gree/(C)ancel: A

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Would you be willing to share your email address with the Electronic Frontier
Foundation, a founding partner of the Let's Encrypt project and the non-profit
organization that develops Certbot? We'd like to send you email about our work
encrypting the web, EFF news, campaigns, and ways to support digital freedom.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(Y)es/(N)o: N
Obtaining a new certificate
Performing the following challenges:
http-01 challenge for sedice.com
Using the webroot path /var/www/php-fpm-1 for all unmatched domains.
Waiting for verification...
Cleaning up challenges
Failed authorization procedure. sedice.com (http-01): urn:ietf:params:acme:error:unauthorized ::
The client lacks sufficient authorization :: Invalid response from http://sedice.com/.well-known/
acme-challenge/PqF0LO40O_HjQitgv8OcAak83b3soN46yJGeNeh2ZlA: "<?xml version=\"1.0\" encoding=\"iso
-8859-1\"?>\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n         \"http://
www."

IMPORTANT NOTES:
- The following errors were reported by the server:

  Domain: sedice.com
  Type:   unauthorized
  Detail: Invalid response from
  http://sedice.com/.well-known/acme-challenge/PqF0LO40O_HjQitgv8OcAak83b3soN46yJGeNeh2ZlA:
  "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n<!DOCTYPE html
  PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n
  \"http://www."

  To fix these errors, please make sure that your domain name was
  entered correctly and the DNS A/AAAA record(s) for that domain
  contain(s) the right IP address.
- Your account credentials have been saved in your Certbot
  configuration directory at /etc/letsencrypt. You should make a
  secure backup of this folder now. This configuration directory will
  also contain certificates and private keys obtained by Certbot so
  making regular backups of this folder is ideal.

This will create xxx.pem files in master-webserver/certs. If that is not the case, check where it creates them and change the mount point. You will get one file for the certificate and another for the private key. The Nginx configuration is exactly the same as with the self-signed certificates.

Every time you need a new certificate you do the same thing: run “bash” in the container and request the new certificate from the command line.

With this, you should have SSL working perfectly, with no certificate validation errors. But there is a small problem we must not overlook: Let's Encrypt certificates have to be renewed every three months without fail, or the websites will start to break.

Renewing them is as simple as running, inside the container:

$ certbot renew

The program takes care of detecting which certificates are about to expire, renewing them and overwriting them with the new version. This is possible thanks to the data it keeps in /etc/letsencrypt, and that is why in the previous steps we mounted that folder on the host disk: if the image has to be rebuilt, or anything else happens, that data is not lost.

But it is a bit of a hassle to have to remember to log in and renew the certificates every three months. If you configured a valid email address you will get alerts, but ideally they should renew themselves.

The good news is that “certbot renew” takes care of everything, including deciding when a renewal should be attempted and when it is not needed. So running it from a daily cron job would be enough. But we have no cron in Docker, so what do we do?

A very simple solution is to write a small bash script that runs an infinite loop of “certbot renew” and a “sleep” of about a day. Rudimentary, but more than enough.

Let's create a file ./letsencrypt/renew.sh with the following contents:

#!/bin/bash
while true
do
    certbot renew
    sleep 48h
done

And in the Dockerfile in the same folder we change the CMD and add a COPY command before it:

COPY renew.sh /

CMD ["bash", "/renew.sh"]

Basically we are telling it to run renew.sh when the container starts; the script runs indefinitely. On startup it attempts a renewal and then sleeps for 48 hours before trying again. I chose 48 hours, but it could be any value; I simply do not think it is necessary to try every day. On Sedice.com I have it in a weekly crontab and it works fine.

If we rebuild and start it again and look at the logs, we can see it running correctly.
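For reference, rebuilding the image and tailing the logs is just (standard docker-compose commands):

$ docker-compose up -d --build letsencrypt
$ docker-compose logs -f letsencrypt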

And that wraps it up. The code is on GitHub:

https://github.com/deavid/docker-lamp/tree/v0.15

Enabling TLS for FTP

Now that the websites run over HTTPS, the HTTP Auth we set up earlier becomes much more secure (remove plain-HTTP access for extra security). But FTP still transmits the passwords in plain text, and right now that is our biggest security risk.

As I have said many times, the best option is to get rid of FTP entirely and switch to SFTP (over SSH). It is actually the easiest path.

For FTP there is a variant called FTPS, which adds TLS to the protocol. This keeps the passwords from travelling in plain text; they go encrypted instead.

But fair warning: most FTP clients people use, such as FileZilla, already support SFTP (SSH). If we really need FTP, it is probably because the software that has to connect has no SFTP support. And if that is the case, chances are it has no FTPS support either, so what I describe here may well be a waste of time.

Basically, to enable TLS support there is a “--tls” option that can take three values:

  • --tls=1: enable TLS support, but only as optional
  • --tls=2: require TLS for the login only
  • --tls=3: require TLS for the data connections as well

In this example I will use --tls=3, which is the most secure.

We will need a certificate for TLS. The simplest way to start is to let the image generate the certificate itself on startup. To do that, we configure it like this:

ftpd-php-apache-1:
  image: stilliard/pure-ftpd
  networks:
    - ftpd-php-apache-1
  ports:
    - "2101:21"
    - "21010-21019:21010-21019"
  volumes:
    - ./www/php-apache-1:/var/www/html/
  environment:
    - PUBLICHOST=localhost # Replace this with your server public IP
    - FTP_PASSIVE_PORTS=21010:21019
    - FTP_USER_HOME=/var/www/html/
    - FTP_USER_NAME=admin
    - FTP_USER_PASS=${FTP_PHP_APACHE_1_PASS-}
    - ADDED_FLAGS=--tls=3
    - TLS_USE_DSAPRAM=true
    - TLS_CN=php-apache-1
    - TLS_ORG=lamp
    - TLS_C=ES

This generates a certificate on every startup. It is not very secure (because of TLS_USE_DSAPRAM), but it does encrypt the connection. Better than nothing.

If we want to use our own certificates, we need to join the certificate and the private key into a single file:

cat php-fpm-1.crt php-fpm-1.key > php-fpm-1.both.pem

Now we mount this file at the path /etc/ssl/private/pure-ftpd.pem; once that is done, the last four environment variables (TLS_*) are no longer needed, so it ends up like this:

ftpd-php-apache-1:
  image: stilliard/pure-ftpd
  networks:
    - ftpd-php-apache-1
  ports:
    - "2101:21"
    - "21010-21019:21010-21019"
  volumes:
    - ./www/php-apache-1:/var/www/html/
    - ./master-webserver/certs/php-fpm-1.both.pem:/etc/ssl/private/pure-ftpd.pem
  environment:
    - PUBLICHOST=localhost # Replace this with your server public IP
    - FTP_PASSIVE_PORTS=21010:21019
    - FTP_USER_HOME=/var/www/html/
    - FTP_USER_NAME=admin
    - FTP_USER_PASS=${FTP_PHP_APACHE_1_PASS-}
    - ADDED_FLAGS=--tls=3

And that is it; there is no more mystery to it. Let's Encrypt certificates can be combined into a single file in the same way, regardless of whether they are *.crt or *.pem: you just concatenate them.
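Assuming the certificate and key ended up in master-webserver/certs as described above (the file names below are placeholders; use whatever certbot actually produced), that would be something like:

cat master-webserver/certs/php-fpm-1-cert.pem \
    master-webserver/certs/php-fpm-1-key.pem \
    > master-webserver/certs/php-fpm-1.both.pem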

The FTPS code is available here:

https://github.com/deavid/docker-lamp/tree/v0.16

Email relay server

To send emails without exposing our application containers to the internet, it is safer to create a relay-type mail server that simply forwards every message it receives.

These relays normally have no password and, thinking about it, if the container holds the password it is almost the same as having no password at all; as long as the relay is only reachable from a single container. We covered this earlier in this same article.

The advantage of setting up email this way is that the SMTP user, password and host live outside the application container, so it cannot steal the credentials. And if it wants to send mail, it has to go through our relay.

Later on, this relay can be extended with anti-spam measures to detect malicious use and stop the service or discard messages. Since we will create one email container per website, if one of them causes problems and we stop it, we will not affect the rest of the websites, let alone the users' mail accounts in Thunderbird/Outlook/etc.

I tried to do this with Postfix, but that mail server expects a fully working operating system, so although it is possible, it is extremely laborious. In the end I went with an existing Docker image that works, based on Debian Jessie and Exim4:

https://hub.docker.com/r/namshi/smtp/

After inspecting its entrypoint script, I was surprised by how easily it configures and starts Exim4, even without having a clue how that mail server works. Postfix may have a simpler configuration, but starting it inside a container is madness.

We simply add the block for the new service to docker-compose.yml:

email-relay:
  image: namshi/smtp
  networks:
    - email-relay
  environment:
    - SMARTHOST_ADDRESS=mail.mysmtp.com
    - SMARTHOST_PORT=587
    - SMARTHOST_USER=myuser
    - SMARTHOST_PASSWORD=secret
    - SMARTHOST_ALIASES=*.mysmtp.com

networks:
  default: {internal: true}
  (...)
  email-relay: {}

Then you attach this new network to whichever container you want, and that container now has a mail system. To send mail from it, connect via SMTP to port 25, with no username or password.
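For example, to let php-apache-1 send mail, its networks list would simply grow by one entry (a sketch; the application would then point its SMTP settings at host “email-relay”, port 25):

php-apache-1:
  image: php:7.3-apache-stretch
  networks:
    - php-apache-1
    - email-relay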

Every time this container receives a message, it forwards it to the mail server you specified in the environment variables. That server is the one that ultimately delivers the mail.

Keep in mind that relay servers are insecure: anyone who can connect to them can send mail. It is very important not to expose the port and to restrict access to only the container that needs to send the emails.

In this case, the passwords end up in docker-compose.yml, which is not great. You can pass them in from the shell and remove them from the file, or use a separate file. Docker Compose supports many ways of loading variables.
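Following the same pattern we used for the FTP password, a sketch would be (the variable name is made up):

environment:
  - SMARTHOST_ADDRESS=mail.mysmtp.com
  - SMARTHOST_PORT=587
  - SMARTHOST_USER=myuser
  - SMARTHOST_PASSWORD=${EMAIL_RELAY_PASS-}
  - SMARTHOST_ALIASES=*.mysmtp.com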

The image is based on Debian Jessie instead of Stretch, which costs about 60 MB of extra disk space because it does not share the base image we use everywhere else. It looks easy to customize, so we could build our own with just smarthost support and later add SpamAssassin or other anti-spam systems. For now I will leave it here.

You can get the code for this chapter here:

https://github.com/deavid/docker-lamp/tree/v0.17

Conclusion

With this installment we wrap up the series on how to build a highly secure LAMP server with Docker. The result is that we have every service encapsulated in its own container, with restricted permissions and the network locked down as much as possible.

If we have configured everything correctly, not just copying the examples but understanding them and adapting them properly to our situation, we should end up with a service where, even if it is vulnerable, the reach of a potential attacker is tremendously limited.

One of the things that should worry us most is the existence of FTP and SFTP. A stolen password would let an attacker write anything and execute code in the containers. Even if they cannot get very far, they can use this to steal other users' credentials and, in the case of e-commerce, more sensitive data such as credit cards if the site handles them.

To avoid this, the FTP and SFTP containers should normally be switched off (or have their ports unrouted). But that is not really workable. Ideally we would have two-factor authentication or one-time tokens, but that is beyond the scope of this tutorial. If I ever find the time to build it, rest assured I will write an article about it.

Other than that, I hope you liked it and find it useful. See you in the next one!