Call for speakers is open! Check out the CFP
Speaker:
Chris Ellis
Nexteam
Chris is a computing and electronics geek, who loves working with Open Source software. He is a jack of all trades consultant usually spending his time engineering systems with PostgreSQL, Java and Linux. Now and then he gets time to design some electronics often involving ESP32s, Raspberry Pi CM4s and LED badges.
POSETTE 2024
Talk
Even JSONB In Postgres Needs Schemas
PostgreSQL offers great support for JSON, which has been a game changer for me in a number of projects and use cases, offering easy solutions for those times where you really need to store complex structures, or the unknowns.
However often you still need to be in control of what is being stored, or at least requiring some shape at minimum.
Thankfully PostgreSQL has a great range of features for ensuring data integrity and making our future lives easier.
We'll take a brief overview of PostgreSQL's JSON support, along with some practical and real world examples of when and how you can use it.
Then we'll look at how you can control those unknown unknowns with JSON Schema and CHECK constraints.
Speaker
Interview
About the Speaker
-
Tell us about yourself: career, family, passions
So, I’m Chris, I grew up near Wolverhampton on the Shropshire border, not too far from The Iron Bridge. These days I have been living in London for the last few years. Running Nexteam, a software consulting company along with some friends.
I’m a computing jack of all trades, master of none, but often better than a master of one.
My knowledge and skills go properly full stack from hardware to frontend. PostgreSQL, Devops, Java are my strongest areas, but I’m just as comfortable fiddling with Ceph clusters, networking and web frontends.
I often find the problem I’m solving more interesting than the specific technology I’m using, as long as it’s Open Source.
-
What is your icebreaker for PostgreSQL events?
About 2 pints :). I'm pretty terrible at this, I probably should go do an elevator pitch course at some point. Thankfully these days I know a fair number of people already and being a speaker now and then often means people come and talk to you about your talk.
-
How do you prepare for an online presentation?
This will only be my second time and it feels a bit different from the first one I did. I'm thankful I'm not in charge of recording it myself, since I'll be forced to limit the number of takes somewhat and have someone else to edit it a little bit.
The parts I found the strangest is listening back to your own talk when it streams, but hopefully there will be enough interaction going on to distract me from listening to myself.
-
Which book are you reading right now?
I've almost finished reading 'Slide Rule', an autobiography of Nevil Shute. He was an aeronautical engineer in the early days of aviation, working on airships in the 1920s before going on to set up his own aircraft manufacturer in the 1930s. He might be best known for his novel 'On the Beach', which was also turned into a film. It's been an interesting read, it's certainly interesting to see that 100 years later, we're still no better at running complex projects.
-
What is your favorite hobby?
I spend a fair amount of my time building things, a bit of a random selection of electronic things like my Electric Elephants. Otherwise I might be listening to some Vinyl on my Hifi, or in the pub with friends. During the summer you'll often find me enjoying the slow pace of a County Cricket match down at The Oval.
About the Talk
-
What will your talk be about, exactly? Why this topic?
Over the years since PostgreSQL got really good support for JSON with JSONB, I've found myself using it more and more for various use cases. Sometimes when it's complex data structures which are not worth being fully mapped into tables. Or when you're dealing with those unknown unknowns and you've just got to deal with storing stuff that you have less control over.
However none of us want to be dealing with those messy situations which arise from not controlling what has actually been stored in your database. PostgreSQL provides so many features for you to be able to enforce the integrity of your data.
Yet most people overlook this when dealing with JSON that they are storing. This came up in a recent project I've been leading. We're storing a fair bit of less important data in a JSONB column, but we still want to enforce some basic rules at minimum around what is being stored.
-
What is the audience for your talk?
It's probably mostly focused towards application developers, however it might be of use to DBAs who deal with application developers on similar problem domains.
-
What existing knowledge should the attendee have?
A basic understanding of PostgreSQL should be enough. I'll start the talk by going over the JSON capabilities of PostgreSQL. Explaining some use cases and some of the rationale which lead to this talk. Before moving on to explain about CHECK constraints and how we can use them to enforce structure of what we store in JSONB. Finally taking a look at some more advanced extensions we can use.
-
Which other talk at this year’s conference would you like to watch?
I'm quite interested to listen to How to Work with Other People by Jimmy Angelakos and Floor Drees.
-
How do you balance technical depth with engaging storytelling in your conference presentations?
I think that making a presentation engaging and well structured is often more important than the technical content, no one remembers boring talks. I try to focus my talks around real world experience and spend time trying to get them to flow and tell a story. That I can then try to layer the technical information into. I'm a believer that talks should inspire you at what to look into and research more.
About PostgreSQL
-
What inspired you to work with PostgreSQL?
People should listen to my Path To Citus Con episode for more. It was a mix of things which started me going down the rabbit hole, but mainly working on a full text search project back around 2008. I will say I've not yet found the White Knight, or Red Queen, but remember what the dormouse said: Feed your head.
-
What is your favorite PostgreSQL feature?
The Community! I find the PostgreSQL community to be friendly and helpful. I think the fact PostgreSQL is an actual project, with multiple competitors collaborating together to be hugely understated but very important.
PostgreSQL has so many cool technical features and my favourite tends to change project by project. Its extensibility is probably my favourite technical feature, but I see that as something which has happened due to it being a community led project.
-
What is the single thing that you think differentiates PostgreSQL most from other databases?
I think the core thing which PostgreSQL got right (very early on) was the plugability of data types and functions. It's this extensibility and approach that I think has differentiated it significantly from not only FLOSS competitors but proprietary competitors. This capability coupled with the community has enabled rapid innovation at times. It's also what will ensure our longevity.
-
What is your favorite PostgreSQL extension or tool? And why?
PostGIS. I've used a fair array of extensions over the years, but PostGIS has definitely been the most common. Even if I don't always use it very much. It's got such an immense amount of power and can make doing some very complicated things really easy.
-
What advice would you give to someone starting their journey with PostgreSQL?
Stay curious and keep on playing, I think you learn by doing, so keep on practicing. You get good by trying things, making mistakes and learning from them. Read the manual, try something, if it goes wrong come and ask for help. Don't just stay in your comfort zone, there are always many ways to solve problems.
-
What are your favorite resources for learning about PostgreSQL?
Planet Postgres is great for finding out about new features, for learning about how others are approaching things.
The PostgreSQL docs are a very underrated resource, they are very comprehensive.
I'd add going to a conference is a great way to discover other ideas and to chat with other people about their ways of doing things and experiences.
-
Could you share a memorable experience or challenge you faced while working with PostgreSQL?
There have been so many over the years. A very rewarding project for me was working on a Smart Energy Analytics platform that had PostgreSQL at its heart. The data volumes only ever grew and the logic only ever got more complex. It was a constant learning of new features to use to solve problems, different performance issues to dig into and try to tune.
Probably the most fun single project was building a real-time election map, back in the era of IE6. PostgreSQL and PostGIS really enabled us to push the boundaries of what was possible back then. The company's core mapping system was junk compared to what we managed to pull off with open source tools and a bit of passion. We ended up in one evening pushing the same amount of bandwidth as the website did in a whole year. Plus being in the count hall inputting data was pretty cool.
-
In your opinion, what are the most common pitfalls or mistakes developers make when working with PostgreSQL?
Viewing the database only from the eyes of their ORM. TBH I hate ORMs, my view is probably still tainted from using Hibernate before Java annotations, that was such an awful experience our whole team vowed never to touch it again. I'm pragmatic enough to say, this doesn't mean never tough an ORM. Just use one that still lets you write actual SQL when you need to. I really don't want to read yet another nested loop join in application code, please.
The other thing I commonly find and really annoys me, is:
VARCHAR(255)
. This is a great way to show that you've paid zero attention to what you're doing. I don't think I've come across an actual use case where the max string length is actually 255 and most people have just cargo culted this from MySQL or other developers. Just useTEXT
and add a CHECK constraint if you need it. -
Which skills are a must have for a PostgreSQL user/developer?
Curiosity. That desire to go and learn how to get the most out of PostgreSQL.
-
What is the most overlooked thing about PostgreSQL?
Hmm, probably the fact it just works, in so many cases. I think it's really easy to only hear about the problems or the tricky cases. But for every person that runs into an issue with wrap-around or performance, etc, there are thousands of perfectly fine running systems, probably getting little maintenance or attention.
Plus just how powerful PostgreSQL is at solving problems, SQL is a great language for solving lots of data problems with ease. PostgreSQL has really good support for the standard and lots of nice extensions that I find really help you solve application problems easily.
-
PostgreSQL is open-source, did that ever help you in anyway and how?
One significant fun moment comes to mind, we had a rather complex bill processing stored function. We left it running over the weekend, in the office on Monday it still hadn't finished, CPU sat at 100%, very little disk IO, all a bit fishy. Used strace, very little syscalls happening, attached GDB, hmm spending a lot of time in SMgr unreeling a singly linked list or unowned relations. Our job had effectively been creating a new temp table inside the loop, we had around 6 million temp table files on disk.
I looked up the code to find a comment: 'In the worst case this could be rather slow; but in all the cases which seem likely to be performance-critical…'. Essentially we'd triggered an edge case no one thought would be common. The fix for our function was pretty trivial and over the years this handling in PostgreSQL looks to have changed a fair bit.
But it was definitely a time that being able to read the code really helped identify and fix a problem quickly.
-
If you had a magic wand, what single thing would you change in PostgreSQL as it is today?
My views on this seem contentious, and I know it's not easy and would require a lot of work, but I think it's important for PostgreSQL to try and tackle it at some point.
That would be being able to support cross major version streaming replication.
I often find upgrades are still a lot of people's big headaches, pg_upgrade is cool and good for lots of situations.
I think we need to spend the time trying to evolve our storage format to support backwards compatibility. So that upgrades could just become install the new binaries and start up, or connect the next version as a streaming replica. I'd also go as far to say supporting cross CPU / OS architecture replication as part of it.
About POSETTE & Events
-
Have you enjoyed previous POSETTE (formerly Citus Con) conferences, either as an attendee or as a speaker?
This will be my first time attending in either way. Virtual conferences have always been something I've not been attracted to in the past. I think because a lot of events just tried to copy their in person formats to virtual, that always made me feel a bit depressed.
But the world has moved on a bit now, so I'm excited to be part of an event which at least is trying not to just copy the in person formats.
-
What motivated you to speak at this year’s POSETTE: An Event for Postgres?
Well, Claire has to take a lot of credit for keeping reminding me to submit some abstracts. Plus some questions on one of my current projects prompted the thought for my talk abstract which got selected.
-
What other PostgreSQL events in 2024 are you excited about?
Well, I'm very excited about PGDay UK 2024 and PGConf EU 2024. I'm helping organise both and really hope they will be excellent events. I'm hoping we can boost attendance for PGDay UK this year, so if any people in the UK please come along, it'll be a great day. Plus while not strictly a PostgreSQL only event, I'm really excited for the return of OggCamp.
-
What advice would you give to fellow speakers preparing for a PostgreSQL conference?
Prepare and practice, especially if you struggle with confidence as a public speaker. It really does get easier as you do it more. But it still easily takes me a few weeks of effort to write and practice a talk.
-
What would be helpful for a first-time speaker?
Firstly, just go for it! Secondly, keep trying. You'll get rejected lots and lots of times before you get accepted. Talk abstracts don't get chosen for lots of reasons, Claire's blog was a great explainer on the process most conferences use, and why you might not get chosen. So keep submitting talks you think are good ideas and submit some variations or other ideas.
Abstracts cost nothing, it's not like you have to have written the talk.
-
Could you share a memorable moment from a previous PostgreSQL conference you attended or spoke at?
On the positive side, I still remember clearly the first talk I ever saw at a PostgreSQL conference, back at PGConf EU 2010 in Stuttgart, Gianni Ciolli talking about building a Chess bot with PostgreSQL stored functions. It was a fun way to talk through the power and capability of functions plus all the stats tracking you get for free to help you optimize your code.
Podcast Appearances
Becoming expert at using PostgreSQL with Chris Ellis
- Podcast episode: Listen on TalkingPostgres.com
Subscribe to notifications to keep up with POSETTE news
Join the conversation
Use the hashtag #PosetteConf
The Postgres team at Microsoft is proud to be the organizer of POSETTE: An Event for Postgres (formerly Citus Con).