Limits to using XML to store data?

shoris · February 27, 2021, 1:09am

I’m making a turn-based RPG and I’m in need of ways to store data. Thus far, I have been planning on using a series of XML docs to store save data, NPC data, dialogue, item info, monster info, and skill info. I’m also using Tiled maps, so I have those tmx files in my game as well. While I was researching XML usage, I realized that it’s on the slower side and that there are size limitations. The game is very narrative-heavy, so there will be a ton of dialogue ids. Also, I’m not using the Monogame content pipeline with the XMLs, if that’s relevant.

So, my question: are these XML docs I have planned too much? I already have a lot of work done on the XML side so I’d like to keep them if it’s fine, but I don’t want to run into issues as my game grows. I’ve been looking into sqlite and it seems like a good option for all the monster/npc/dialogue data, but I’d like to keep my save data as an XML-- is mixing data storage anti-best practice?

MrValentine · February 27, 2021, 2:33am

What best practices? Please for the life of you, forget the idea of ‘Best Practices’…

I am looking into SQL based design myself.

EDIT

Additional Data

Best Practices - Aren’t (forbes.com)

shoris · February 27, 2021, 2:47am

Ha true. I only ask because I seem to come up against weird roadblocks when I least expect them…

Synammon · February 27, 2021, 2:59am

No problem mixing them. I use a combination of binary files and XML in one project and it works fine. XML works well for dialogue for me. In addition to XML and sqlite you can look at JSON. It’s fairly lightweight and there is good support for serialization.

shoris · February 27, 2021, 8:19pm

That’s good to know! Did you run into any size limitations for XML dialogue storage? How much dialogue did that project have? I have working dialogue using XML, but I have no idea what will happen when I start writing a lot more

jamie_yello · February 28, 2021, 2:49am

I have used over 100,000 objects in a list that serialize and deserialize to xml in a couple of seconds (probably a total of around 1,000,000 fields). This was on a hard drive, not an ssd.

Surprisingly reliable and fast.

Synammon · February 28, 2021, 5:16am

I haven’t run into size issues with XML. I use it to store my maps too, which tend to end up being more massive than the dialogue. A single map file is usually 10MB. Dialogue doesn’t get near that size in my experience.

MrValentine · February 28, 2021, 5:51am

Excuse me whilst I get nerdy a little… but this intrigued me…

Word Counts of the Most Popular Books in the World - Foster Grant UK

WORD COUNTS OF THE BOOKS IN J.R.R. TOLKIEN’S LORD OF THE RINGS SERIES:

The Hobbit – 95,356 words

The Fellowship of the Ring – 187,790 words

The Two Towers – 156,198 words

The Return of the King – 137,115 words

The entire Lord of the Rings series (including The Hobbit) – 576,459 words

If we assume the average word contains, say, five letters, just for the sake of this equation (and note that maths is nobody’s strong point), then 576,459 words times 5 characters per word gives 2,882,295 letters for the entire series. At 8 bits per letter (char) that is 23,058,360 bits, and converting that to megabytes gives 2.7488 megabytes according to this calculator site:

Convert Bits to Megabytes (bit → MB) (convertlive.com)

So yeah, assume the XML overhead adds some 30-40% to the file and voilà, still under 10MB…
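
For anyone who wants to check the arithmetic, here it is as a small Python sketch. The 5 letters per word and 8 bits per letter are the assumptions from the post; the 40% figure is just the upper end of the overhead guess, and a megabyte is taken as 1024 × 1024 bytes, which is what reproduces the calculator's number.

```python
# Back-of-the-envelope size estimate for the word counts quoted above.
WORDS = 576_459          # whole series, including The Hobbit
LETTERS_PER_WORD = 5     # assumption from the post
BITS_PER_LETTER = 8      # one byte per character

letters = WORDS * LETTERS_PER_WORD
bits = letters * BITS_PER_LETTER
megabytes = bits / 8 / (1024 * 1024)    # binary megabytes
with_xml_overhead = megabytes * 1.40    # assumed 40% markup overhead

print(f"{letters:,} letters, {bits:,} bits, {megabytes:.4f} MB")
print(f"with 40% XML overhead: {with_xml_overhead:.2f} MB")
```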

I hope someone finds this useful

Stainless · February 28, 2021, 11:12am

First thing: why use XML?

Human readable
Structured

Why not to use XML

Inefficient for size
Comparatively slow to read
Insecure

What do I mean by that? Well, let’s take a single boolean.

The XML version of it will be something along the lines of this

 `<valuename>true</valuename>`

That is 27 bytes, before any indentation or line break

The binary version will be a single byte, if you have a lot of booleans you could group them together and have a single bit for a boolean
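
A rough sketch of that size difference, in Python purely for illustration (the idea is the same in C#). The names `pack_flags` and `unpack_flags` are made up for this example, and the bit order is an arbitrary choice.

```python
import struct

# The same boolean stored as an XML element and as a single binary byte.
xml_bool = b"<valuename>true</valuename>"
binary_bool = struct.pack("?", True)

def pack_flags(flags):
    """Pack up to eight booleans into one byte, one bit each (bit 0 first)."""
    if len(flags) > 8:
        raise ValueError("this sketch only packs up to eight flags")
    value = 0
    for i, flag in enumerate(flags):
        if flag:
            value |= 1 << i
    return bytes([value])

def unpack_flags(packed, count):
    """Reverse of pack_flags: read `count` booleans back out of one byte."""
    return [bool((packed[0] >> i) & 1) for i in range(count)]

flags = [True, False, True, True, False, False, True, False]
packed = pack_flags(flags)

print(len(xml_bool), "bytes as XML")
print(len(binary_bool), "byte as binary")
print(len(packed), "byte for", len(flags), "packed booleans")
```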

Why is it insecure?
Well, it’s human readable, so anyone with a text editor can read it and change it

Why is it slow to read?
You have to parse the structure to get the value, a binary version would be a single call to ReadByte()

Does this mean that XML is evil?

Yes and no, it depends on your game design. If you don’t care about the problems with XML, then the advantages of the structured human readable file are attractive.

For me personally, I never use XML anymore. If I need to, I will use JSON in an editor (a different, lighter text format, though it has much the same trade-offs), but never in game…

In game I only use binary data, I often structure data with L2 cache size in mind, so you can read every value in a structure without invalidating the cache, but then I am paranoid.

For me the best solution is to use something human readable in the editor, and binary in game.

MrValentine · February 28, 2021, 2:44pm

Excuse me while I drop this here: [for me and @Stainless ]

How Does CPU Cache Work? What Are L1, L2, and L3 Cache? (makeuseof.com)

Cache - CPU and memory - GCSE Computer Science Revision - BBC Bitesize

In relation to structuring data with L2 cache size in mind:

My argument is, 256KB would be your core focus, which could create limitations for a lot of stuff, but I suppose it is a good wall to code against.

I mean 256KB is huge…

Anyway, resume…

shoris · March 1, 2021, 2:33am

One more question-- I haven’t been using the pipeline/mgcb editor, just the plain XML files and parsing with C#-- would that have long-term negative/slowing effects?

MrValentine · March 1, 2021, 3:12am

Doubt it, just make sure to dispose and the like…

dmanning23 · March 3, 2021, 3:56pm

The main difference between something like SQLite and XML is that XML lets you store unstructured data and treat it like a NoSQL datastore.

Sqlite is a relational database, so for example you might have a Dungeon table, a Monster table, and a DungeonMonster table for storing instances of monsters that are contained in each dungeon. This table would only need a foreign key to the dungeon, a FK to the monster, and the location where the monster spawns in the dungeon.
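
As a minimal sketch of that three-table layout, here it is with Python's built-in `sqlite3` module and an in-memory database. The column names, the sample rows, and the `monsters_in` helper are all made up for illustration; the SQL itself would be the same from C#.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Dungeon (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Monster (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per monster instance placed in a dungeon.
CREATE TABLE DungeonMonster (
    id         INTEGER PRIMARY KEY,
    dungeon_id INTEGER NOT NULL REFERENCES Dungeon(id),
    monster_id INTEGER NOT NULL REFERENCES Monster(id),
    spawn_x    INTEGER NOT NULL,
    spawn_y    INTEGER NOT NULL
);
""")

conn.execute("INSERT INTO Dungeon (id, name) VALUES (1, 'Crypt')")
conn.executemany(
    "INSERT INTO Monster (id, name) VALUES (?, ?)",
    [(1, "Slime"), (2, "Bat")],
)
conn.executemany(
    "INSERT INTO DungeonMonster (dungeon_id, monster_id, spawn_x, spawn_y) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, 3, 4), (1, 1, 7, 2), (1, 2, 5, 5)],
)

def monsters_in(dungeon_name):
    """Return (monster name, spawn_x, spawn_y) for every instance in a dungeon."""
    rows = conn.execute(
        """
        SELECT m.name, dm.spawn_x, dm.spawn_y
        FROM DungeonMonster AS dm
        JOIN Monster AS m ON m.id = dm.monster_id
        JOIN Dungeon AS d ON d.id = dm.dungeon_id
        WHERE d.name = ?
        ORDER BY dm.id
        """,
        (dungeon_name,),
    )
    return rows.fetchall()

print(monsters_in("Crypt"))
```

The Monster row is stored once and the same Slime is placed twice, which is the kind of thing that gets repetitive in a flat XML file.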

If you are comfortable using XML, and aren’t used to thinking relationally anyway… I don’t see any reason why you should switch.

Maybe think about abstracting all the storage behind a repository pattern though so you can switch out your storage solutions easily. Personally, I usually do three different repositories: an in-memory version for testing and quick development, a Sqlite version for simple local storage, and then the full-blown cloud implementation that integrates with PlayFab or Firebase.
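
A rough idea of what that can look like, sketched in Python with only two backends (in-memory and a plain XML file). Every name here (`MonsterRepository`, `XmlMonsterRepository`, `describe`, the `monster` element) is hypothetical, and a real game would have more operations than `get` and `put`.

```python
import os
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Protocol

class MonsterRepository(Protocol):
    """What the game code depends on; it never sees the storage format."""
    def get(self, monster_id: str) -> Optional[Dict[str, str]]: ...
    def put(self, monster_id: str, data: Dict[str, str]) -> None: ...

class InMemoryMonsterRepository:
    """Dictionary-backed version for tests and quick development."""
    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, str]] = {}

    def get(self, monster_id: str) -> Optional[Dict[str, str]]:
        return self._items.get(monster_id)

    def put(self, monster_id: str, data: Dict[str, str]) -> None:
        self._items[monster_id] = dict(data)

class XmlMonsterRepository:
    """Stores each monster as <monster id="..." .../> in a single XML file."""
    def __init__(self, path: str) -> None:
        self._path = path

    def _load_root(self) -> ET.Element:
        if os.path.exists(self._path):
            return ET.parse(self._path).getroot()
        return ET.Element("monsters")

    def get(self, monster_id: str) -> Optional[Dict[str, str]]:
        for node in self._load_root().findall("monster"):
            if node.get("id") == monster_id:
                return {k: v for k, v in node.attrib.items() if k != "id"}
        return None

    def put(self, monster_id: str, data: Dict[str, str]) -> None:
        root = self._load_root()
        for node in root.findall("monster"):
            if node.get("id") == monster_id:
                root.remove(node)  # replace any existing entry
        ET.SubElement(root, "monster", {"id": monster_id, **data})
        ET.ElementTree(root).write(self._path, encoding="utf-8", xml_declaration=True)

def describe(repo: MonsterRepository, monster_id: str) -> str:
    """Game-side code: works with any repository implementation."""
    monster = repo.get(monster_id)
    if monster is None:
        return "unknown monster"
    return f"{monster['name']} ({monster['hp']} HP)"
```

Swapping XML for SQLite later would then mean writing one more class with the same two methods, with no changes to `describe` or anything else that only talks to the repository.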

Hope this helps.
Cheers!

MrValentine · March 3, 2021, 7:26pm

Yeah, I was going to mention using multiple XML files for various locations or things as you can load, read, write, close, dispose, at any time.

EnthusiastGuy · March 3, 2021, 9:03pm

Hi there!

If I may make a suggestion, try to fragment the information you need over multiple files. Whether they are JSON or XML, it should be fine as long as you break them up.

But consider the nature of the data you are using. I assume you have some dialogue data, items, triggers, areas, characters. How much of that do you actually need in memory at any one time?

If you break off your world into “areas”, then you should be fine. You’ll probably have different tilesets anyway for certain areas, so why not just do that?

Then, each area can load its own resources and you can manage them as you need, leaving the global resources at a minimum. By global I mean things like your inventory, skills, stashes, heroes, quest progress and so on.
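
Something along these lines, as a Python sketch only. It assumes one dialogue file per area named `<area>.xml` containing `<line id="...">text</line>` elements; that layout and the `AreaDialogue` class are invented for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

class AreaDialogue:
    """Keeps only the current area's dialogue in memory."""

    def __init__(self, data_dir):
        self._data_dir = Path(data_dir)
        self.area = None
        self.lines = {}

    def enter(self, area):
        """Load <data_dir>/<area>.xml and drop whatever was loaded before."""
        root = ET.parse(self._data_dir / f"{area}.xml").getroot()
        # Rebinding self.lines releases the previous area's dictionary.
        self.lines = {
            node.get("id"): (node.text or "") for node in root.iter("line")
        }
        self.area = area

    def say(self, line_id):
        return self.lines[line_id]
```

Usage would be `dialogue.enter("forest")` on an area transition and `dialogue.say("greet_01")` during play, with the global data (inventory, quest progress) loaded separately and kept around.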

You could also try out the profiler in Visual Studio and monitor how much memory you are using at times, try and experiment with exaggerated lengths of data to push the limits and get some idea of minimum requirements for your game.

Nowadays, RAM is cheap, but HDD(SSD) is cheaper. If you can read and discard, do it.

Good luck!

MrValentine · March 3, 2021, 9:22pm

I think the concern there is mobile users, where storage is too limited to say anything good about…

Arcadenut · March 3, 2021, 9:40pm

Compress the files… saves space and text compresses really well.
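
A quick sketch of the idea using gzip from Python's standard library (.NET has an equivalent `GZipStream`). The function names are made up, and how much you actually save depends on how repetitive your files are.

```python
import gzip

def save_compressed(path, xml_text):
    """Write XML text to disk as gzip-compressed UTF-8."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write(xml_text)

def load_compressed(path):
    """Read gzip-compressed UTF-8 text back into a string."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return f.read()

def build_sample_dialogue(count):
    """Generate a repetitive dialogue document, similar in shape to real data."""
    lines = "".join(
        f'<line id="npc_{i:04d}" speaker="Villager">Nice weather today.</line>'
        for i in range(count)
    )
    return f"<dialogue>{lines}</dialogue>"
```

The file can still be exported uncompressed for editing; only the copy that ships with the game needs to be compressed.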

shoris · March 4, 2021, 3:08pm

Hey thanks for the suggestion to break up the files-- I hadn’t thought of breaking them up between areas, actually

shoris · March 4, 2021, 3:08pm

And thanks everyone again for the help! Now I’m just stuck trying to decide between xml and json.

EnthusiastGuy · March 4, 2021, 10:56pm

Hi again,

Please do not get stuck on implementation details. For the sake of discussion, if it saves you a day of work, take this: “use xml and change later”.

That “later” I believe will come when your current implementation fails to satisfy the development requirements.

There’s a saying somewhere: no time like now. Just code, make something work. It’s easier to erase and rewind rather than not having recorded any kind of progress to learn something from.
