Inferential database seeding in Clojure

Getting Started

                
(defproject ...
  :dependencies [[dibble.core "0.2.0-beta1"]]
  ...)
                
            
First off, put the dependency in your project file to download from Clojars.
                
(def db {:vendor :mysql
         :db "db-name"
         :user "user"
         :password "pass"})
                
            
Next, in your Clojure source file, specify a valid connection. Connection's are obtained through Korma, so use that syntax. Don't forget to specify the :vendor!

Values

                
(defseed people
  {:database db :table :people}
  [:value-of :name "Mike"]
  [:value-of :age 21])

(seed-table people)
                
            
value-of is a function that takes a column name and a value to insert into the row. This is useful is you know the concrete data you want to seed with, which is often the case.

Randomness

                
(defseed people
  {:database db :table :people}
  [:randomized :name]
  [:randomized :lucky_number])

(seed-table people)
                
            
randomized is the reason I built this library. This function will inference the underlying table structure and pick a suitable value seed with. If the column 'name' is a text column, Dibble will produce some text. The same goes for numeric types, and anything else that is implemented.

Foreign Keys

                
(defseed pets
  {:database db :table :pets}
  [:inherit :pet_id]
  [:randomized :name])

(defseed people
  {:database db :table :people}
  [:randomized :number :fk {pets :pet_id}])

(seed-table people)
                
            
Data of any substantial complexity is going to be using foreign keys. Dibble handles this gracefully. inherit is a function that receives a value as a foreign key. The foreign key is delivered to the seed by another seed via the :fk option. :fk takes a map of tables to columns to push its value to.

Type Options

                
(defseed people
  {:database db :table :people :n 50}
  [:randomized :name :length 16]
  [:randomized :number :min 5 :max 10])

(seed-table people)
                
            
Dibble takes options to randomized, inherit, value-of, and defseed. :n indicates the number of rows to insert. It's probably easy to determine what the rest of these do, but I'll enumerate them below just in case it isn't.

String options

Option Usage Description Mutual exclusion
:min :min 5 Minimum of 5 characters :length
:max :max 8 Maximum of 8 characters :length
:length :length 6 Length is exactly 6 characters :min, :max
:first-name :subtype :first-name An English first name :length, :min, :max
:last-name :subtype :last-name An English last name :length, :min, :max
:full-name :subtype :full-name An English full name. String, space, string. :length, :min, :max
:spaces? :spaces? true String may contain spaces Not yet implemented
:symbols? :symbols? true String may contain symbols Not yet implemented
:digits? :digits? true String may contain digits Not yet implemented

Integer & Decimal options

Option Usage Description Mutual exclusion
:min :min 20 Minimum value of 20, inclusive
:max :max 80 Maximum value of 80, inclusive
:even? :even? true Generated number is even Not yet implemented
:odd? :odd? true Generated number is odd Not yet implemented
:prime? :prime? true Generated number is prime Not yet implemented

Type Overriding

                
(def db {:db "simulation" :user "" :password "" :vendor :sqlite3})
(def type-map {:unknown_column_type {:type :integer :min 0 :max 10}})

(seed-table {:database db :table :people :types type-map}
  [:randomized :unknown_column_type])
                
            
Type overriding let's you specify what Dibble type a column should map to. This is supported for SQLite3.

Policies

                
(defseed people
  {:database db :table :people
   :policy :clean-slate :n 200}
  [:randomized :name]
  [:randomized :number])

(seed-table people)
                
            
Policies let Dibble know what do before writing data to a table.

Policy options

Policy Usage Description
Append {:policy :append} Only append to the table. Append is the default.
Clean Slate {:policy :clean-slate} Delete all rows in :table before seeding
Dependent {:policy :clean-slate :dependents [:pets :children]} Apply the specified policy to all tables in :dependents

Support

Database Supported?
MySQL Yes
Postgres Yes
Oracle Not yet implemented
MSSQL Not yet implemented
SQLite3 Yes
MongoDB Not yet implemented

MySQL support

MySQL Type Maps to Dibble type
varchar :string
int :integer
float :decimal
double :decimal
decimal :decimal
char :string
tinytext :string
mediumtext :string
text :string
longtext :string
tinyint :integer
smallint :integer
mediumint :integer
bigint :integer
date :datetime
datetime :datetime
time :datetime
timestamp :datetime
blob :binary
mediumblob :binary
longblob :binary
enum Not supported
set Not supported

Postgres support

Postgres Type Maps to Dibble type
varchar :string
integer :integer
double precision :decimal
decimal :decimal
numeric :decimal
char :string
text :string
smallint :integer
bigint :integer
date :datetime
timestamp :datetime
bytea :binary

SQLite3 Support

SQLite3's unique type system makes it harder, but not impossible to infer type information. The solution has been to infer column information if the column name matching any of the supported MySQL column types. If the column type can't be infered, Dibble will look to the :types map to pick the correct type.
Copyright 2012 Michael Drogalis

Back to Top