Taking a stab at making a privacy-preserving analytics thing for my blog.
I'm hashing the user's IP and User-Agent, so they're never stored directly. Not 100% accurate, but good enough for me to work out whether a page-view is unique or not.
Country is very rough - using GeoIP.
Referer is either from HTTP or if the linking site uses UTM.
Seems *broadly* accurate compared to other plugins.
I'm not sure of the performance if this hits a million rows. Might need further DB tables.
reshared this
maco
in reply to Terence Eden • • •in case you have any interest in existing privacy-friendly analytics solutions, that’s Plausible Analytics’ goal @plausible
plausible.io/about
About Plausible Analytics
Plausible AnalyticsTerence Eden
in reply to maco • • •Thanks. But it is much more fun to build my own 😄
Fazal Majid
in reply to Terence Eden • • •if performance is your concern, DuckDB is incredibly fast. I had Claude write me an AWS Cloudfront to Parquet 1 converter with IPinfo lite MMDB lookups, which you can then import into a DuckDB. It takes 41 milliseconds to find the top 10 countries over 8.8M rows, about 5 years' worth of logs:
```
zulfiqar ~/lognami>duckdb blog.duckdb
DuckDB v1.3.2 (Ossivalis) 15004e53
Enter ".help" for usage hints.
D .timer on
D select count(*) from web;
┌────────────────┐
│ count_star() │
│ int64 │
├────────────────┤
│ 8820061 │
│ (8.82 million) │
└────────────────┘
Run Time (s): real 0.008 user 0.006334 sys 0.000867
D select country, count(*) from web group by 1 order by 2 desc limit 10;
┌─────────┬──────────────┐
│ country │ count_star() │
│ varchar │ int64 │
├─────────┼──────────────┤
│ US │ 3942963 │
│ FR │ 727561 │
│ DE │ 570468 │
│ CN │ 456912 │
│ NL │ 418870 │
│ SG │ 408658 │
│ GB │ 279144 │
│ CA │ 218756 │
│ IN │ 175326 │
│ RU │ 157289 │
├─────────┴──────────────┤
│ 10 rows 2 columns │
└────────────────────────┘
Run Time (s): real 0.041 user 0.181591 sys 0.648902
```
Terence Eden
in reply to Terence Eden • • •SELECT country, COUNT(*) AS count
FROM wp_edent_stats
WHERE DATE(time) = '2025-09-05'
GROUP BY country
ORDER BY count DESC;
Now, what's the SQL query to turn these into emoji flags?
Christopher Isene
in reply to Terence Eden • • •Christopher Isene
in reply to Christopher Isene • • •Terence Eden
in reply to Christopher Isene • • •@cisene Not necessarily.
The emoji flags are made up of two-character codes.
So the flag 🇬🇧 is literally 🇬 🇧
What I'm wondering is if I can turn a returned GB into those symbols.
Christopher Isene
in reply to Terence Eden • • •Emily_S
in reply to Terence Eden • • •@cisene I'm going to be cranky with the Unicode folk if adding an offset to the characters of a two letter country code doesn't yield the characters to make up the flag 😂
Bring back the old ways, convert numbers to characters by adding 48 to them.
Terence Eden
in reply to Emily_S • • •Christopher Isene
in reply to Terence Eden • • •Terence Eden
in reply to Christopher Isene • • •That's not how it works.
The flags are the ISO country codes.
If a new country of KK were created, the flag would be 🇰🇰 which is 🇰 🇰
Emily_S
in reply to Terence Eden • • •Kris
in reply to Terence Eden • • •I apologize for the JSON abuse, but that was easiest.
Olivier Mehani
in reply to Terence Eden • • •So, here's some zeitgeist to put together:
Unicode flags are just the combination of the regional indicator symbols compart.com/en/unicode/U+1F1E6, so 🇫+ 🇷 gives 🇫🇷
And framapiaf.org/@sylvestre/11513…
So maybe you could... `| cargo run tr A-Z 🇦-🇿 `
Unicode
https://www.compart.com/en/unicode/U+1F1E6Sylvestre
2025-09-02 13:37:47
Terence Eden
in reply to Terence Eden • • •Getting there! I can now (inefficiently) see how many page views and (roughly) how many unique users have visited my blog.
I can also see where people are coming from on a specific day.
(Remember, this is just for me to noodle about with. This isn't going to be an amazingly accurate, professional-grade plugin.)
Terence Eden
in reply to Terence Eden • • •These are the countries which send the most visitors to my blog.
I'm using local GeoIP database - so I don't have to send anyone's IP address to a 3rd party. Also keeps costs at zero!
Terence Eden reshared this.
Terence Eden
in reply to Terence Eden • • •I want to draw a graph of visitors.
I don't want to call a 3rd party graphing service.
I don't want to use a big library from someone else.
I am not going to write my own graphics generator.
So I think I'm going to build a Unicode sparkline generator for PHP.
▁▂▃▄▅▆▇█
⅛¼⅜½⅝¾⅞1
DamonHD
in reply to Terence Eden • • •An approach that I use that you may hate is in the accordians in this page, using HTML5 meter elements:
earth.org.uk/energy-series-dat…
16WW Energy Series Dataset
www.earth.org.ukTerence Eden
in reply to DamonHD • • •I'm rather more used to vertical lines.
Hmmm. I wonder if they can be rotated?
DamonHD
in reply to Terence Eden • • •Terence Eden
in reply to DamonHD • • •@DamonHD meter {transform: rotate(-90deg);}
Works!
Thanks for the suggestion 🙂
DamonHD
in reply to Terence Eden • • •Hurrah!
(Please send me 1% from your second USD billion of consequential profits...)
Marius Gundersen - mdg 🌻
in reply to Terence Eden • • •@DamonHD have you tried setting the writing-mode to vertical-lr?
codepen.io/mariusgundersen/pen…
Terence Eden
in reply to Marius Gundersen - mdg 🌻 • • •@gundersen @DamonHD
Nice!
I am hereby awarding you 1 EdentCoin™.
Hugo Mills
in reply to Terence Eden • • •There's already a command-line one of those I found a few years ago:
github.com/deeplook/sparklines
GitHub - deeplook/sparklines: Text-based sparklines for the command line mimicking those of Edward Tufte.
GitHubmORA
in reply to Terence Eden • • •Terence Eden
in reply to mORA • • •Terence Eden
in reply to Terence Eden • • •Bit of CSS to rotate it, table for layout, job's a goodun!
DamonHD
in reply to Terence Eden • • •TABLE FOR LAYOUT!!!!!
I disown you. I would never* do such a thing.
*Except when I do.
Terence Eden
in reply to DamonHD • • •But, to be fair, it *is* tabular data - so I think I can get away with it 😆
DamonHD
in reply to Terence Eden • • •DamonHD
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •Might have gone a *little* overboard! I rather like this though.
Self-hosted analytics showing a graph of the last 7 days.
Today's top referers, pages, and countries.
Super simple and suitable for my needs.
Next up - import historic data and work out if I want to keep countries & referers in it.
#SelfHosting #WordPress #SmolWeb
Eric Goodwin
in reply to Terence Eden • • •Owen Blacker
in reply to Terence Eden • • •Terence Eden
in reply to Owen Blacker • • •Owen Blacker
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •I'm basically farting around with the design because I don't want to do the hard work of periodically consolidating the database.
I *think* for now I just need a separate table of:
Date, Views, Users.
I can increment the Views every time the page is visited using something like `ON DUPLICATE KEY UPDATE visits = visits + 1;`
But deduplicating the users will be harder. I don't want to run a daily cron job for this.
Terence Eden
in reply to Terence Eden • • •Interestingly, I've started seeing hits come in from `noai.duckduckgo.com`
Small numbers for now, but wonder if they'll grow? I'll probably consolidate it to the 2nd level for my stats. Not sure there's much useful I can do with the information.
data af
in reply to Terence Eden • • •information and granularity overload can make for a challenging decision making environment. At my time in the Civil Service our web teams wanted *so many* metrics, but ultimately only listened to the user researchers. Can’t put detailed 30 minute chats into a PowerBI dashboard though.
I suppose you could write anti-AI content to juice those numbers.
Terence Eden
in reply to Terence Eden • • •I think I've cracked the problem - MySQL makes my brain hurt.
Table 1: Logs *everything*. Date, post ID, user, etc.
Table 2: Is date, view count, user count.
When (1) is updated, an AFTER INSERT trigger increments the view count for that day.
It also counts the number of unique users for that day and updates (2).
Table 3 is the same as (2) but on a per-post basis. It also updates after insert.
That way, I don't have to use logic in my PHP app - all it does is fire an insert.
Ben Curthoys
in reply to Terence Eden • • •if you are updating a "user count" in a trigger, then every single insert is going to be in contention of a lock on that row, and you will get massive concurrency problems that bottleneck your throughput.
My advice if there is even the slightest chance this will scale beyond one blog would be to do the write without any triggers, and the calculate the total per day when you need it, or every 5 minutes, or something.
Terence Eden
in reply to Ben Curthoys • • •@bencurthoys This is only for my blog. While it does occasionally get popular, it's usually less than 2,000 visits per day.
Is there an equivalent of "cron" for MySQL / MariaDB? I'd rather not rely on WordPress's cron because I've found it a bit fickle.
Ben Curthoys
in reply to Terence Eden • • •yeah but you know you're going to open source it and 5 minutes later someone will be using it for something crazy.
Anyway my experience is not MySql specific, but in any RDBMS it will be way more sensible to
SELECT COUNT(DISTINCT User ID) FROM views WHERE timestamp BETWEEN startofday AND endofday
than to try to maintain totals - as long as you index it properly.
Ben Curthoys
in reply to Ben Curthoys • • •Terence Eden
in reply to Ben Curthoys • • •Chris Ellis
in reply to Terence Eden • • •be careful with 'UPDATE count = count + 1" patterns. They're unlikely to accurately count as you expect due to concurrency issues. Especially in MVCC systems, it will depend alot on transaction isolation levels.
For this kind of thing, just logging the visits as records and aggregating on analysis / periodically is often the better approach. Removes a lot of the concurrent write headache and give you more flexibility.
Terence Eden
in reply to Chris Ellis • • •Chris Ellis
in reply to Terence Eden • • •not super familiar with Maria/MySQL, more a PostgreSQL person.
But maybe: mariadb.com/docs/server/server…
Events Overview | MariaDB Documentation
mariadb.comChris Ellis
in reply to Chris Ellis • • •Chris Ellis
in reply to Chris Ellis • • •Terence Eden
in reply to Chris Ellis • • •Cat West
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •On the advice of many people here, I'm looking at using EVENT rather than TRIGGER.
Looks like I need to raise a ticket with my host to see if this is possible. I'd rather not use a cron job.
⊥ᵒᵚ⁄Cᵸᵎᶺᵋᶫ∸ᵒᵘ ☑️
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •Arse. My host doesn't support EVENTs on my cheap plan.
Fair enough. Looks like I'll have to use cron.
Wonder whether it's better to have a stored routine, or just keep the logic in the PHP?
My heart says use the stored routine - but it's probably more flexible and fixable to have the SQL in the PHP.
PJ Evans
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •My self-written stats plugin for WordPress seemed to cope with a small burst of traffic from HackerNews.
Blog post later today. Visit shkspr.mobi/blog/ to subscribe by RSS or Email.
Also available on the fediverse as @blog
Terence Eden’s Blog
Terence Eden’s BlogTerence Eden
in reply to Terence Eden • • •And I can now see exactly which websites send traffic to which post!
Hello all you lovely Fediverse friends 🙂
Blog post at shkspr.mobi/blog/2025/09/reaso…
Terence Eden
in reply to Terence Eden • • •Tidied up the control panel.
I can now click on a single post and see how many times it has been viewed, how many unique users visited it, which countries they came from, and what referers sent traffic to that page.
So far, it is *broadly* accurate compared to my 3rd party plugin. It over-counts users and under-counts page views. But that's fine.
Next steps: draw some pretty graphs. Ideally with weekly / monthly / yearly groupings.
Terence Eden
in reply to Terence Eden • • •Who needs a fancy graphing library?
SVG polylines are easy enough to generate with code!
Needs a bit of tidying up and looks rubbish when more than 100 points. So will need to group by week, month, and year.
Pretty happy with that so far.
Terence Eden reshared this.
Zoe
in reply to Terence Eden • • •Terence Eden
in reply to Zoe • • •Brilliant idea 😉
Terence Eden
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •I can now group views by year, month, week.
Writing a basic graphing library is fun!
Terence Eden
in reply to Terence Eden • • •Web design is my passion!
Ok, first make it work *then* make it look good 😁
Terence Eden reshared this.
Christopher Isene
in reply to Terence Eden • • •DamonHD
in reply to Terence Eden • • •jack
in reply to Terence Eden • • •I stopped wearing that smartwatch because it was so clunky compared to a Casio F-91W on a nylon strap
if it was open-source/jailbroken and I could make it, y'know, show the data I want to see, however...
Terence Eden
in reply to jack • • •@jackeric it works perfectly with the Open Source app gadgetbridge.org/
I can extract all my data - sleep, heart rate, steps, etc.
I also swapped the strap to something slightly nicer.
Gadgetbridge
gadgetbridge.orgjack
in reply to Terence Eden • • •Terence Eden
in reply to jack • • •I've emailed both the manufacturer and OEM to see if I can get the SDK. No luck yet.
You can upload your own photo to one of the faces. Not exactly LCARS, but a start 🙂
jack
in reply to Terence Eden • • •Marius Gundersen - mdg 🌻
in reply to Terence Eden • • •tip: use a class-less stylesheet that just makes things look nice without you having to add any classes to the markup. For example picocss
picocss.com/docs/classless
Class-less version • Pico CSS
Pico CSSTerence Eden
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •My CSS breaks but only when I look at my all-time stats.
Which is a nice problem to have, I guess.
I don't want to wrap the views, nor do I want to increase the column width. I guess font scaling?
Matt Round
in reply to Terence Eden • • •Terence Eden
in reply to Matt Round • • •Jack
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •On the suggestion of @mattround I've combined the dates into one box.
Testing shows it should be good until I breach 100 million views. Which will be a problem for 2078's me!
discontinuity
in reply to Terence Eden • • •Terence Eden
in reply to Terence Eden • • •Nice to see the Lemmyverse (or whatever we're calling it) start to send traffic to my blog.
As every with federated social media, the individual number of referers is quite low. But together, when added up, give it a bit more heft.
Terence Eden reshared this.
db0
in reply to Terence Eden • • •Terence Eden
in reply to db0 • • •Timo Tijhof
in reply to Terence Eden • • •Terence Eden
in reply to Timo Tijhof • • •@krinkle well spotted!
Blog post coming later, but you can see some details at:
mastodon.social/@Edent/1151688…
Terence Eden
2025-09-08 13:00:39
Glenjamin
in reply to Terence Eden • • •Terence Eden
in reply to Glenjamin • • •Glenjamin
in reply to Terence Eden • • •Terence Eden
in reply to Glenjamin • • •@glenjamin Nice, thanks. I'll take a look.
TBH, I'm not *overly* worried about performance at the moment.
In the last week(ish) I've only generated 16,846 rows.
I expect roughly 500k per year.
Marius Gundersen - mdg 🌻
in reply to Terence Eden • • •I've used charts.css in a recent backoffice project, where you create a table using html templating language and then it's styled like a bar chart. Very handy!
chartscss.org/
Charts.css
chartscss.orgKevin Marks
in reply to Terence Eden • • •Post: kevinmarks.com/svgsparklines.h…
code: view-source:kevinmarks.com/sparkline.svg
SVG Sparklines
www.kevinmarks.comant
in reply to Terence Eden • • •Terence Eden
in reply to ant • • •Free updated GeoIP legacy databases
mailfud.orgLeonardo reshared this.
Leonardo
in reply to Terence Eden • • •Paul
in reply to Terence Eden • • •A million rows shouldn't cause any problems if you have sensible indexes, but you might want to rewrite:
DATE(time) = '2025-09-05'
as:
time >= '2025-09-05' AND time < '2025-09-06'
because DATE(time) *might* cause any indexes on time to be ignored, since you're transforming the column to something other than what was indexed (I had a client a few years ago with a really slow report because of this)
Terence Eden
in reply to Paul • • •hades
in reply to Terence Eden • • •Terence Eden
in reply to hades • • •@hades I 100% disagree.
I can't track anything off my site.
I'll buy you a beer if you can reverse any of the ones I posted. I'll even tell you the algorithm if you'd like?
hades
in reply to Terence Eden • • •Terence Eden
in reply to hades • • •If you have left my beach and I notice your footprints, has that invaded your privacy?
Terence Eden
Unknown parent • • •Ben Curthoys
in reply to Terence Eden • • •Are you familiar with plausible.io/about ? @plausible
They might be already doing what you're trying to do...
About Plausible Analytics
Plausible AnalyticsTerence Eden
in reply to Ben Curthoys • • •@bencurthoys @plausible
Yes, but I'm more interested in building something that I can maintain and understand.
Learning is more fun than just installing.