When you optimize a Crystal program, pay attention to language features that inline code. For example, pay attention to how you use blocks (consequences here and here).

Also pay attention to how you use macros. Macros, like ECR.embed and Slang.embed, inline code at the point where they are invoked. This can be powerful, because macros actually generate code—but, ten invocations later, you have ten copies of the code.

Here's a case of too many copies, but with a very happy ending...

Ktistec uses both ECR.embed and Slang.embed to generate web pages from views and partials. I wrote code to count the number of places Ktistec used embed for each view and partial it renders. There's a long tail, but here are the big ones:
| View or partial | Number of embeds |
| --- | --- |
| src/views/layouts/default.html.ecr | 204 |
| src/views/partials/modals.html.slang | 204 |
| src/views/partials/header.html.slang | 204 |
| src/views/partials/footer.html.slang | 204 |
| src/views/pages/generic.html.slang | 155 |
| src/views/partials/object/label.html.slang | 36 |
| src/views/partials/object/content.html.slang | 36 |
| src/views/partials/collection.json.ecr | 28 |
| src/views/partials/thread.html.slang | 12 |
| src/views/partials/detail.html.slang | 12 |
| src/views/partials/object.html.slang | 12 |
| src/views/partials/actor-panel.html.slang | 11 |
| src/views/partials/object.json.ecr | 11 |
| src/views/partials/paginator.html.slang | 11 |
| src/views/objects/thread.json.ecr | 8 |
| src/views/partials/activity/label.html.slang | 6 |
| src/views/mentions/index.json.ecr | 6 |
| src/views/remote_follows/index.json.ecr | 6 |
| src/views/settings/settings.json.ecr | 6 |
| src/views/tags/index.json.ecr | 6 |
| src/views/activities/activity.json.ecr | 5 |
| src/views/partials/editor.html.slang | 5 |
| src/views/objects/object.json.ecr | 5 |
| src/views/actors/remote.json.ecr | 4 |
...
The layout is part of every page and is rendered with every view, so lots of copies. Every page has a header and a footer (and some default modal dialogs) so you get those, too. The generic view is a little less obvious. It's used to render pages for which there is no more specific view—typically pages served for 400 Bad Request or 401 Unauthorized. Objects (posts) are rendered in a variety of contexts, so it's no surprise label.html.slang and content.html.slang are popular.

ECR.embed and Slang.embed inline templates at the point where they are invoked, but beyond that they don't really customize the generated code—they just duplicate it. What we want is one function for each view or partial, which wraps embed and returns JSON or HTML.
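As a rough illustration of that idea, here is a minimal sketch. The macro `embed_footer` and the method `render_footer` are hypothetical names invented for this example. The macro only stands in for `ECR.embed` and `Slang.embed`, which read a template file at compile time and so can't be shown in a self-contained snippet. This is not the code from the commits mentioned below, just the general shape of the change.

```crystal
# Hypothetical stand-in for ECR.embed / Slang.embed. The real macros read a
# template file at compile time; this one hard-codes a tiny "template" so the
# sketch runs on its own. Like them, it pastes its body into each call site
# and uses local variables (here `host`) from the surrounding scope.
macro embed_footer(io)
  {{io}} << "<footer>" << host << "</footer>"
end

# One method per partial: the macro expands exactly once, inside this method.
# With a real template the body would be a call along the lines of
#   ECR.embed "path/to/template.ecr", io
def render_footer(host : String) : String
  String.build do |io|
    embed_footer(io)
  end
end

# Callers now make an ordinary method call instead of re-expanding the template.
puts render_footer("example.com")
puts render_footer("social.example")
```

The trade-off is that the template's inputs become explicit method parameters instead of whatever local variables happen to be in scope at the call site.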

Those changes mostly occur in commits from 399287cf to 4b025f50. To say that they made a huge difference is a gross understatement. Executable size decreased by ~13%. Build time decreased by ~50%, and the memory required to build decreased by ~30%.

#ktistec #crystallang #optimization


the crystal programming language always inlines blocks, which is great for performance but trades off space for speed. using blocks effectively means keeping this in mind.

somewhere along the line, i learned the habit of passing a block to a function as a means of customizing the behavior of the function. if the function that takes the block is large, it's important to remember that the body of the function is inlined where the function is called, which may not be what you are expecting. if you call the function multiple times, you even get multiple copies.

i just committed code that fixes an egregious example of this problem. in this case these ~30 lines of code replace the blocks with procs (which aren't inlined) and cut ~24mb (that's megabytes) off the executable (over a third of its size).
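a minimal sketch of the difference, not the actual ktistec change: `each_scaled_block` and `each_scaled_proc` are made-up names for illustration. the first yields to a block, so its body is inlined at every call site. the second takes a proc, so it is compiled as an ordinary method and the proc is invoked through a call.

```crystal
# Block version: a method that yields is inlined, together with the block,
# at every place it is called.
def each_scaled_block(values : Array(Int32), factor : Int32, &)
  values.each do |v|
    yield v * factor
  end
end

# Proc version: an ordinary method that is not inlined into its callers.
# The customization arrives as a Proc argument instead of a block.
def each_scaled_proc(values : Array(Int32), factor : Int32, callback : Proc(Int32, Nil)) : Nil
  values.each do |v|
    callback.call(v * factor)
  end
end

block_total = 0
each_scaled_block([1, 2, 3], 2) { |v| block_total += v }

proc_total = 0
add = ->(v : Int32) { proc_total += v; nil }
each_scaled_proc([1, 2, 3], 2, add)

puts block_total # 12
puts proc_total  # 12
```

for a method this small the block version is the better choice. the proc version starts to pay off when the method body is large and called from many places.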

i regularly shoot myself in the foot trying to be clever, so i don't know how prevalent this problem is in practice, but it's definitely something to keep in mind, especially if you see compile times and executable sizes growing!

#crystallang #ktistec