PostgreSQL: is it better to use multiple databases with one schema each, or one database with multiple schemas?





After a comment on one of my questions, I'm wondering whether it is better to use one database with X schemas or X databases with one schema each.

My situation: I'm developing a web app where, when people register, I (currently) create a database for each of them. No, it's not a social network: every user must have access to their own data and never see the data of other users.

That's the approach I used for the previous version of my application (which is still running on MySQL): through the Plesk API, for every registration, I:

  1. Create a database user with limited privileges;
  2. Create a database that can be accessed only by the previously created user and the superuser (for maintenance);
  3. Populate the db

Now I need to do the same with PostgreSQL (the project is maturing and MySQL doesn't fulfill all its needs).

I need the backups of all the databases/schemas to be independent: pg_dump works perfectly either way, and likewise users can be configured to access just one schema or one database.
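To make the two backup layouts concrete, here is a minimal sketch of how the pg_dump call differs between them. The function name `build_dump_command`, the database and schema names, and the backup paths are all hypothetical, and the custom archive format is an assumption. The function only builds the argument list; nothing runs until you pass it to `subprocess.run`.

```python
from typing import List, Optional


def build_dump_command(database: str, out_file: str,
                       schema: Optional[str] = None,
                       user: str = "postgres") -> List[str]:
    """Build a pg_dump argument list for a whole database or one schema."""
    cmd = ["pg_dump", "--username", user,
           "--format", "custom", "--file", out_file]
    if schema is not None:
        # Restrict the dump to a single schema (one-db-many-schemas layout)
        cmd += ["--schema", schema]
    cmd.append(database)
    return cmd


# Layout A: one database per customer (hypothetical names)
per_db = build_dump_command("customer_foo_db", "/backups/customer_foo_db.dump")
# Layout B: one shared database, one schema per customer
per_schema = build_dump_command("myapp", "/backups/customer_foo.dump",
                                schema="customer_foo")
print(" ".join(per_db))
print(" ".join(per_schema))
```

Either list can be handed to `subprocess.run(cmd, check=True)` from a backup job.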

So, assuming you are more experienced PostgreSQL users than me, what do you think is the best solution for my situation, and why?

Will there be performance differences between using $x databases and $x schemas? And which solution will be easier to maintain in the future (reliability)?

Edit: I almost forgot: all of my databases/schemas will always have the same structure!

Edit2: For the backup issue (using pg_dump), it may be better to use one database with many schemas and dump all the schemas at once. Recovery would be quite simple: load the full dump on a dev machine, then dump and restore just the schema needed. That is one additional step, but dumping all the schemas at once seems faster than dumping them one by one.
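As a possible alternative to the dev-machine round trip (this is not what I described above, just a sketch): if the full dump is taken in pg_dump's custom format, pg_restore can restore only the objects of one schema directly from it. The function name, file path and schema name below are hypothetical, and you may need to create the target schema first, since `--schema` selects the objects inside it.

```python
from typing import List


def build_schema_restore_command(dump_file: str, database: str,
                                 schema: str) -> List[str]:
    """Build a pg_restore argument list that restores one schema's objects
    from a custom-format dump of the whole database."""
    return ["pg_restore", "--dbname", database, "--schema", schema, dump_file]


cmd = build_schema_restore_command("/backups/myapp_full.dump",
                                   "myapp", "customer_foo")
print(" ".join(cmd))
```

Whether this is faster than the two-step approach depends on the dump size, so treat it as something to test rather than a given.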

p.s: sorry if i forgot some 'W' char in the text, my keyboard suffer that button ;)

UPDATE 2012

Well, the application structure and design have changed a lot during the last two years. I'm still using the one-database-with-many-schemas approach, but I have one database for each version of my application:

    Db myapp_01
        \_ my_customer_foo_schema
        \_ my_customer_bar_schema
    Db myapp_02
        \_ my_customer_foo_schema
        \_ my_customer_bar_schema

For backups, I'm dumping each database regularly, then moving the backups to the dev server.

I'm also using PITR/WAL backups but, as I said before, it's not likely I'll have to restore all the databases at once, so they will probably be dropped this year (in my situation it is not the best approach).

The 1-db-many-schemas approach has worked very well for me so far, even though the app structure has totally changed:

"I almost forgot: all of my databases/schemas will always have the same structure!"

...now, every schema has its own structure that changes dynamically in reaction to the users' data flow.





1:



A PostgreSQL "schema" is roughly the same as a MySQL "database".


Having many databases on a PostgreSQL installation can get problematic; having many schemas will work with no trouble.


So you definitely want to go with one database and multiple schemas within that database.
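A minimal sketch of what per-customer provisioning could look like with one schema per user. The helper names and the `customer_foo` name are hypothetical. It only generates the SQL text, so you can review it before running it through your driver of choice, and the role's password is assumed to be set separately.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def provisioning_statements(customer: str) -> List[str]:
    """SQL to create a login role and a schema owned by that role."""
    ident = quote_ident(customer)
    return [
        # Password is deliberately left out; set it separately
        f"CREATE ROLE {ident} LOGIN;",
        # The role owns its schema, mirroring "one database per user"
        f"CREATE SCHEMA {ident} AUTHORIZATION {ident};",
        # Unqualified table names resolve inside the customer's schema
        f"ALTER ROLE {ident} SET search_path = {ident};",
    ]


for statement in provisioning_statements("customer_foo"):
    print(statement)
```

Populating the schema (step 3 of the original registration flow) would follow these statements, for example by restoring a template dump into it.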

2:



Definitely, I'll go for the 1-db-many-schemas approach.

This allows me to dump the whole database but restore just one schema very easily, in several ways:
  1. Dump the db (all the schemas), load the dump in a new db, dump just the schema I need, and restore it back into the main db
  2. Dump the schemas separately, one by one (but I think the machine will suffer more this way, and I'm expecting something like 500 schemas!)
Otherwise, googling around I've seen that there is no automatic procedure to duplicate a schema (using one as a template), but many suggest this way:
  1. Create a template schema
  2. When you need to duplicate it, rename it to the new name
  3. Dump it
  4. Rename it back
  5. Restore the dump
  6. The magic is done.
I've written a few lines of Python to do that; I hope they can help someone (code written in 2 seconds, don't use it in production):
    import os
    import sys
    import pg

    # Take the new schema name from the second command-line argument
    # (the first is the filename)
    newSchema = sys.argv[1]
    # Temp folder for the dumps
    dumpFile = '/test/dumps/' + str(newSchema) + '.sql'
    # Settings
    db_name = 'db_name'
    db_user = 'db_user'
    db_pass = 'db_pass'
    schema_as_template = 'schema_name'

    # Connection
    pgConnect = pg.connect(dbname=db_name, host='localhost', user=db_user, passwd=db_pass)
    # Rename the template schema to the new name
    pgConnect.query("ALTER SCHEMA " + schema_as_template + " RENAME TO " + str(newSchema))
    # Dump it
    command = 'export PGPASSWORD="' + db_pass + '" && pg_dump -U ' + db_user + ' -n ' + str(newSchema) + ' ' + db_name + ' > ' + dumpFile
    os.system(command)
    # Rename it back to its default name
    pgConnect.query("ALTER SCHEMA " + str(newSchema) + " RENAME TO " + schema_as_template)
    # Restore the previous dump to create the new schema
    restore = 'export PGPASSWORD="' + db_pass + '" && psql -U ' + db_user + ' -d ' + db_name + ' < ' + dumpFile
    os.system(restore)
    # Want to delete the dump file?
    os.remove(dumpFile)
    # Close connection
    pgConnect.close()
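Since the script above pastes a command-line argument straight into SQL and into a shell command, one cheap safeguard is to check the schema name against an allowlist pattern first. This is a sketch, not part of the original script: the helper name `safe_schema_name` and the lowercase-only rule are my assumptions, and the 63-character cap reflects PostgreSQL's default identifier length limit.

```python
import re

# Lowercase letters, digits and underscores; must not start with a digit;
# at most 63 characters in total.
_SCHEMA_NAME = re.compile(r"[a-z_][a-z0-9_]{0,62}")


def safe_schema_name(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise."""
    if not _SCHEMA_NAME.fullmatch(name):
        raise ValueError("refusing unsafe schema name: %r" % name)
    return name


print(safe_schema_name("my_customer_foo_schema"))
try:
    safe_schema_name("foo; DROP SCHEMA bar")
except ValueError as exc:
    print(exc)
```

In the script, that would mean wrapping the argument once, e.g. `newSchema = safe_schema_name(sys.argv[1])`, before it is used anywhere else.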


3:


A number of schemas should be more lightweight than a number of databases, although I cannot find a reference that confirms this. But if you really want to keep things very separate (instead of refactoring the web application so that a "customer" column is added to your tables), you may still want to use separate databases: I'd argue that you can more easily restore a particular customer's database this way, without disturbing the other customers.


4:


I would say, go with multiple databases AND multiple schemas :). Schemas in postgres are a lot like packages in Oracle, in case you are familiar with those.

Databases are meant to separate entire sets of data, while schemas are more like data entities. For instance, you could have one database for an entire application with the schemas "UserManagement", "LongTermStorage" and so on.

"UserManagement" would then contain the "User" table, as well as all the stored procedures, triggers, sequences, etc. that are needed for user management. Databases are entire programs; schemas are components.


5:


In a PostgreSQL context I recommend using one database with multiple schemas, as you can (e.g.) UNION ALL across schemas but not across databases.

In that sense, a database is completely insulated from other databases, whilst schemas are not insulated from other schemas within the same database.

If you, for some reason, have to consolidate data across schemas in the future, this will be easy to do over multiple schemas.

With multiple databases you would need multiple database connections, and you would have to collect and merge the data from each database "manually" in application logic.

The latter has advantages in some cases, but for the most part I think the one-database-multiple-schemas approach is more useful.
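To illustrate the consolidation point, here is a minimal sketch that builds a single UNION ALL query over the same table in several schemas. The function names, the `orders` table and its columns are hypothetical, and the literal quoting assumes PostgreSQL's default `standard_conforming_strings = on`.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def union_all_query(schemas: List[str], table: str,
                    columns: List[str]) -> str:
    """Build one query that reads `table` from every schema in `schemas`."""
    if not schemas:
        raise ValueError("at least one schema is required")
    cols = ", ".join(quote_ident(c) for c in columns)
    parts = [
        # Tag each row with the schema it came from
        f"SELECT {quote_literal(s)} AS source_schema, {cols} "
        f"FROM {quote_ident(s)}.{quote_ident(table)}"
        for s in schemas
    ]
    return "\nUNION ALL\n".join(parts)


print(union_all_query(["my_customer_foo_schema", "my_customer_bar_schema"],
                      "orders", ["id", "total"]))
```

The resulting query runs over one connection; with separate databases, the same consolidation would need one connection per database plus merging in the application.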


