jHolidays project

Some time ago I decided to create a small utility that generates an iCal calendar from an input file (a listing of events, holidays, etc.).

Intro

During implementation, however, it grew into a separate Google Code project, jHolidays, and its goals changed slightly. I wanted a piece of software that lets me specify events (holidays, birthdays, weekends, special occasions and so on) in a clear, simple, plain format, and then translates this declarative representation into an object model. Imagine we need to create a list of birthdays of friends and colleagues. This is simple, since each birthday has a fixed date: only the year changes, while the month and the day of the month stay the same. For example:

# Name Month Day Original (dd.mm.yyyy)
1 John 3 24 24.03.1950
2 Jack 4 25 25.04.1980
3 Lizzie 5 26 26.05.1985
4 Boss 6 20 20.06.1965

Half of all legal holidays (or even more) are fixed events. For example, International Women’s Day is a fixed event: it occurs on the 8th of March each year. Another category is floating events. Before we describe floating events in detail, let’s review a few examples.

  1. The Third Saturday in October is the name of an annual college football rivalry game between the University of Alabama Crimson Tide and the University of Tennessee Volunteers.
  2. First Friday is a city-wide public event that occurs on the first Friday of every month. The events may serve many purposes, including art gallery openings and social networking.
  3. Memorial Day is a United States federal holiday observed on the last Monday of May.
  4. Labor Day is a United States federal holiday observed on the first Monday in September.

These events have three main aspects:

  • Month (January, February…)
  • Day of week (Sunday, Monday…)
  • Order of day of week in month (e.g. first, second … last)
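To make these three aspects concrete, here is a minimal sketch of how a floating date could be resolved with the standard java.time API. It is not jHolidays code: the class FloatingDateSketch, its resolve method and the LAST constant are names I made up for illustration.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.Month;
import java.time.temporal.TemporalAdjusters;

/** Hypothetical helper (not part of jHolidays): resolves a floating event for a year. */
final class FloatingDateSketch {

    /** Hypothetical marker meaning "last occurrence in the month". */
    static final int LAST = -1;

    /**
     * @param ordinal 1..4 for first..fourth, or LAST for the last occurrence
     */
    static LocalDate resolve(int year, Month month, DayOfWeek dayOfWeek, int ordinal) {
        LocalDate firstOfMonth = LocalDate.of(year, month, 1);
        if (ordinal == LAST) {
            return firstOfMonth.with(TemporalAdjusters.lastInMonth(dayOfWeek));
        }
        return firstOfMonth.with(TemporalAdjusters.dayOfWeekInMonth(ordinal, dayOfWeek));
    }

    public static void main(String[] args) {
        // Memorial Day: last Monday of May -> prints 2009-05-25
        System.out.println(resolve(2009, Month.MAY, DayOfWeek.MONDAY, LAST));
        // Labor Day: first Monday in September -> prints 2009-09-07
        System.out.println(resolve(2009, Month.SEPTEMBER, DayOfWeek.MONDAY, 1));
    }
}
```

The month, the day of week and the ordinal map one-to-one onto the three aspects listed above; the year is the only input that changes between calls.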

Finally, there is the third category: dependent events. These depend on another holiday or event. For example, several Christian holidays depend on the date of Easter:

  1. The Palm Sunday: -7 days from Easter
  2. Ascension Day: +39 days from Easter
  3. Trinity Sunday: +49 days from Easter

These events have two main aspects:

  • Parent event
  • Offset

The offset is the number of days that is added to (or subtracted from) the parent event’s date in order to calculate the dependent event’s date.
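The offset arithmetic can be sketched in a few lines with java.time. This is only an illustration, not the library's implementation: DependentDateSketch and resolve are hypothetical names, and the Easter date is hard-coded (Western Easter Sunday 2009) instead of being computed.

```java
import java.time.LocalDate;

/** Hypothetical helper (not part of jHolidays): applies a day offset to a parent date. */
final class DependentDateSketch {

    /** A negative offset moves the date before the parent, a positive one after it. */
    static LocalDate resolve(LocalDate parentDate, int offsetDays) {
        return parentDate.plusDays(offsetDays);
    }

    public static void main(String[] args) {
        // Assumption: Western Easter Sunday 2009, hard-coded for the example
        LocalDate easter2009 = LocalDate.of(2009, 4, 12);

        // Palm Sunday: -7 days -> prints 2009-04-05
        System.out.println(resolve(easter2009, -7));
        // Ascension Day: +39 days -> prints 2009-05-21
        System.out.println(resolve(easter2009, 39));
    }
}
```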

Data model

Let’s summarize and introduce the event data model.

Holidays class diagram


As you can see from the picture, the IEvent interface is at the top of the hierarchy. AbstractEvent is the skeletal implementation of the IEvent interface. The concrete subclasses DependentEvent, FixedEvent and FloatingEvent have protected constructors, since the library uses the Abstract Factory pattern to create instances. This will be described in detail later. Each concrete subclass must override these methods:

  1. getDate(int year) – must return the date on which the event occurs in the given year.
  2. getID() – each event instance must have its own unique identifier (like a GUID).
  3. getDescriptor() – must return the event’s descriptor.

The event descriptor stores all the data needed to create a concrete event. Here is its structure (eventDescriptorClass diagram). As you can see, the event descriptor stores several data fields, with getters and setters for them:

  1. id – stores the unique identifier of the event. When events are being built, the library loads all event descriptors (from a file, database, network, etc.) and checks that their identifiers are unique.
  2. parentID – stores the identifier of the parent of the given event. For dependent events this field must store a valid identifier of another event. In this case both the parent and the dependent event should be stored in the same data source, so that the library can link them. For independent events this field must store the EventDescriptor.ROOT_ID constant.
  3. name – the event name, e.g. “my holiday”, “my birthday”, “Easter” and so on.
  4. description – the event description: text that is associated with the given event and describes its details.
  5. expression – the field that the library uses to detect the event type and its occurrence date.
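The uniqueness check mentioned for the id field could look roughly like the sketch below. It is not the real EventDescriptor class: DescriptorSketch and duplicateIds are hypothetical names, and ROOT_ID = 0 is my assumption (I only name the constant above, not its value).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for an event descriptor (not the jHolidays class). */
final class DescriptorSketch {

    /** Assumed value of the root marker; only the constant's name is known. */
    static final int ROOT_ID = 0;

    final int id;
    final int parentId;
    final String name;
    final String description;
    final String expression;

    DescriptorSketch(int id, int parentId, String name, String description, String expression) {
        this.id = id;
        this.parentId = parentId;
        this.name = name;
        this.description = description;
        this.expression = expression;
    }

    /** An independent event has no parent. */
    boolean isRoot() {
        return parentId == ROOT_ID;
    }

    /** Returns every id that occurs more than once, in order of first repetition. */
    static List<Integer> duplicateIds(List<DescriptorSketch> descriptors) {
        Set<Integer> seen = new HashSet<>();
        Set<Integer> duplicates = new LinkedHashSet<>();
        for (DescriptorSketch descriptor : descriptors) {
            if (!seen.add(descriptor.id)) {
                duplicates.add(descriptor.id);
            }
        }
        return new ArrayList<>(duplicates);
    }
}
```

An empty result means that all identifiers are unique and the descriptors can be linked safely by parentID.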

Here is an example of an event descriptor that defines a fixed event with no parent:

Field name Field value
id 1
name My birthday
description Occurs each year on the same date (1 October).
parentID 0
expression 01.10
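Judging by the examples, a fixed expression is written as dd.MM. A minimal sketch of turning such an expression into a date for a given year, using java.time.MonthDay, might look like this. FixedExpressionSketch is a hypothetical name, and the library itself may parse the field differently.

```java
import java.time.LocalDate;
import java.time.MonthDay;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper (not part of jHolidays): parses a fixed "dd.MM" expression. */
final class FixedExpressionSketch {

    // Assumption: day first, then month, separated by a dot (e.g. "01.10" is 1 October)
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd.MM");

    static LocalDate resolve(String expression, int year) {
        MonthDay monthDay = MonthDay.parse(expression, FORMAT);
        return monthDay.atYear(year);
    }

    public static void main(String[] args) {
        // prints 2009-10-01
        System.out.println(resolve("01.10", 2009));
    }
}
```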

Here is an example of two event descriptors that define two floating events:

Field name Field value
id 2
name Thanksgiving Day
description Occurs each year on the 4th Thursday of November.
parentID 0
expression 4%Thu%Nov

Field name Field value
id 3
name Memorial Day
description Is celebrated on the last Monday of May.
parentID 0
expression L%Mon%May

The general form of the expression field for floating events (in Java regex format) is:

([1-4L])%(Sun|Mon|Tue|Wed|Thu|Fri|Sat)%(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)

Finally, here is an example of a dependent event descriptor:

Field name Field value
id 4
name My friend’s birthday
description Occurs each year on the 5th of December
parentID 0
expression 05.12

Field name Field value
id 5
name Week after birthday celebration
description The week after my friend’s birthday
parentID 4
expression +7

As you can see from the previous example, there are two event descriptors. The first is a fixed event with id=4. The second is a dependent event with id=5 and parentID=4: the parentID field specifies the parent of the given event. The expression field in this case specifies the offset in days that is added to the parent event’s occurrence date. In this example the first event occurs each year on the 5th of December, so the dependent event occurs on the 12th of December (+7 days from the parent).
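Since the expression field alone decides which kind of event gets built, the detection step can be sketched with three regular expressions. This is an illustration under my own assumptions, not the library's parser: ExpressionKindSketch, Kind, kindOf and offsetOf are hypothetical names, and the fixed and dependent patterns are inferred from the examples in this post.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of jHolidays): detects the event type from an expression. */
final class ExpressionKindSketch {

    enum Kind { FIXED, FLOATING, DEPENDENT, UNKNOWN }

    // Assumed from the examples: "dd.MM", e.g. "05.12"
    private static final Pattern FIXED = Pattern.compile("\\d{2}\\.\\d{2}");
    // Ordinal (1-4 or L for last), day of week, month, e.g. "4%Thu%Nov"
    private static final Pattern FLOATING = Pattern.compile(
            "([1-4L])%(Sun|Mon|Tue|Wed|Thu|Fri|Sat)"
            + "%(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)");
    // Assumed from the examples: signed number of days, e.g. "+7" or "-14"
    private static final Pattern DEPENDENT = Pattern.compile("[+-]\\d+");

    static Kind kindOf(String expression) {
        if (FIXED.matcher(expression).matches()) {
            return Kind.FIXED;
        }
        if (FLOATING.matcher(expression).matches()) {
            return Kind.FLOATING;
        }
        if (DEPENDENT.matcher(expression).matches()) {
            return Kind.DEPENDENT;
        }
        return Kind.UNKNOWN;
    }

    /** Signed day offset of a dependent expression. */
    static int offsetOf(String expression) {
        if (kindOf(expression) != Kind.DEPENDENT) {
            throw new IllegalArgumentException("Not a dependent expression: " + expression);
        }
        return Integer.parseInt(expression); // parseInt accepts a leading '+' or '-'
    }
}
```

For the two descriptors above, kindOf("05.12") gives FIXED and kindOf("+7") gives DEPENDENT with an offset of 7 days.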

Building events

By “building events” we mean the process of transforming EventDescriptor instances into instances of AbstractEvent subclasses. First of all, we need to load a collection of descriptors from some external data source, e.g. a file, a database, the network and so on. For this purpose the library provides a special collection and multiple readers:

jHolidays IO

As you can see from the class diagram, the IDescriptorReader interface declares the read() method. This method is the key to the library’s whole IO structure. Concrete implementations of IDescriptorReader read an external data source (e.g. a CSV file, XML or a database) and produce a DescriptorCollection.

DescriptorCollection extends java.util.ArrayList and provides a set of methods for validating event descriptors after they have been parsed from an external data source and put into the collection.

Let’s review the concrete implementations of the IDescriptorReader interface and examples of their usage.

CsvReader

CsvReader reads a DescriptorCollection from a CSV file. Each line of the file must match the following pattern:

Java Regex: (\\d+)HH(.+)HH(.+)HH(\\d*)HH(.+)$
or
Verbal: id HH event name HH event description HH parentID HH expression

In both cases HH stands for the CSV delimiter (for example “;” or “;;;”). Here is an example of a well-formed CSV file with 5 event descriptors in it:

1;April Fools' Day;April Fools' Day or All Fools';;01.04
2;Valentine's Day;A holiday celebrated on February 14 by many people throughout the world;;14.02
3;Thanksgiving Day;The 4th Thursday of November;;4%Thu%Nov
4;Week after Thanksgiving Day;Example of dependent event;3;+7
5;Two weeks before Thanksgiving Day;Another example of dependent event;3;-14

This CSV file defines the following events:

id | name | description | parentID | expression
1 | April Fools’ Day | April Fools’ Day or All Fools’ | | 01.04
2 | Valentine’s Day | A holiday celebrated on February 14 by many people throughout the world | | 14.02
3 | Thanksgiving Day | The 4th Thursday of November | | 4%Thu%Nov
4 | Week after Thanksgiving Day | Example of dependent event | 3 | +7
5 | Two weeks before Thanksgiving Day | Another example of dependent event | 3 | -14
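To show how the line pattern and the delimiter fit together, here is a minimal sketch that substitutes a concrete delimiter for HH and splits one line into its five fields. It is not CsvReader's actual code: CsvLineSketch, linePattern and parse are hypothetical names, and quoting the delimiter with Pattern.quote is my own choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of jHolidays): parses one CSV line of descriptors. */
final class CsvLineSketch {

    /** Builds the line pattern, replacing HH with the quoted delimiter. */
    static Pattern linePattern(String delimiter) {
        String d = Pattern.quote(delimiter);
        return Pattern.compile("(\\d+)" + d + "(.+)" + d + "(.+)" + d + "(\\d*)" + d + "(.+)");
    }

    /**
     * Returns {id, name, description, parentID, expression}.
     * parentID is an empty string for independent events.
     */
    static String[] parse(String line, String delimiter) {
        Matcher matcher = linePattern(delimiter).matcher(line);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Malformed line: " + line);
        }
        String[] fields = new String[5];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = matcher.group(i + 1);
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] fields = parse("4;Week after Thanksgiving Day;Example of dependent event;3;+7", ";");
        // prints: 4 | 3 | +7
        System.out.println(fields[0] + " | " + fields[3] + " | " + fields[4]);
    }
}
```

Note that the fourth group uses \d* rather than \d+, which is what allows the empty parentID of lines 1 to 3.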

Here is the code that reads the given CSV file:

import java.util.List;

import com.google.code.jholidays.core.DescriptorCollection;
import com.google.code.jholidays.core.EventBuilder;
import com.google.code.jholidays.exceptions.NotSupportedEventException;
import com.google.code.jholidays.io.IDescriptorReader;
import com.google.code.jholidays.io.csv.CsvReader;
import com.google.code.jholidays.plugins.IEvent;

public class Main {

    public static void main(String[] args) throws IllegalArgumentException,
            NotSupportedEventException {

        // Read event descriptors from input.csv, using ";" as the delimiter
        IDescriptorReader reader = new CsvReader("input.csv", ";");
        DescriptorCollection coll = reader.read();

        // Number of descriptors that were read
        System.out.println(coll.size());

        // Transform descriptors into event instances
        EventBuilder builder = new EventBuilder();
        List<IEvent> events = builder.buildEvents(coll);

        final int year = 2009;

        // Print the date on which each event occurs in the given year
        for (IEvent event : events) {
            System.out.println(event.getDate(year));
        }
    }
}

And here is the output:

Wed Apr 01 00:00:00 MSD 2009
Sat Feb 14 00:00:00 MSK 2009
Thu Nov 26 00:00:00 MSK 2009
Thu Dec 03 00:00:00 MSK 2009
Thu Nov 19 00:00:00 MSK 2009

XmlReader

This implementation of the IDescriptorReader interface reads a DescriptorCollection from an XML file. Here is the XSD schema that any XML file must match in order to be read by XmlReader:

<?xml version="1.0" encoding="UTF-8"?>
 <schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.org/DescriptorCollectionXml" xmlns:tns="http://www.example.org/DescriptorCollectionXml" elementFormDefault="qualified">

    <complexType name="EventDescriptor">
    	<sequence>
    		<element name="id" type="int" maxOccurs="1" minOccurs="1"></element>
    		<element name="name" type="string" maxOccurs="1"
    			minOccurs="1">
    		</element>
    		<element name="description" type="string" maxOccurs="1"
    			minOccurs="0">
    		</element>
    		<element name="parent_id" type="int" maxOccurs="1"
    			minOccurs="0">
    		</element>
    		<element name="expression" type="string" maxOccurs="1" minOccurs="1"></element>
    	</sequence>
    </complexType>

    <complexType name="DescriptorCollection">
    	<sequence>
    		<element name="EventDescriptor" type="tns:EventDescriptor" maxOccurs="unbounded" minOccurs="0"></element>
    	</sequence>
    </complexType>

    <element name="descriptors" type="tns:DescriptorCollection"></element>
 </schema>

Here is an example of an XML file:

<descriptors xmlns="http://www.example.org/DescriptorCollectionXml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.example.org/DescriptorCollectionXml">
	<descriptor>
		<id>1</id>
		<name>April Fools’ Day</name>
		<description>April Fools’ Day or All Fools’</description>
		<expression>01.04</expression>
	</descriptor>
	<descriptor>
		<id>2</id>
		<name>Valentine’s Day</name>
		<description>A holiday celebrated on February 14 by many people throughout the world</description>
		<expression>14.02</expression>
	</descriptor>
	<descriptor>
		<id>3</id>
		<name>Thanksgiving Day</name>
		<description>The 4th Thursday of November</description>
		<expression>4%Thu%Nov</expression>
	</descriptor>
	<descriptor>
		<id>4</id>
		<name>Week after Thanksgiving Day</name>
		<description>Example of dependent event</description>
		<parent_id>3</parent_id>
		<expression>+7</expression>
	</descriptor>
	<descriptor>
		<id>5</id>
		<name>Two weeks before Thanksgiving Day</name>
		<description>Another example of dependent event</description>
		<parent_id>3</parent_id>
		<expression>-14</expression>
	</descriptor>
</descriptors>

As you can see, the xmlns namespace must be specified inside the XML file, because XmlReader validates every input file against the schema.

For an independent event, you should not specify <parent_id></parent_id> inside the XML.
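To illustrate how a missing parent_id element can be treated as "no parent", here is a small sketch built on the JDK's DOM parser. It is not how XmlReader is implemented and it skips schema validation entirely; XmlDescriptorSketch and parentIds are hypothetical names, and ROOT_ID = 0 is an assumed value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper (not part of jHolidays): reads parent ids from descriptor XML. */
final class XmlDescriptorSketch {

    /** Assumed value of the root marker used for independent events. */
    static final int ROOT_ID = 0;

    /** Returns the parent id of each descriptor element, in document order. */
    static int[] parentIds(String xml) {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Cannot parse descriptor XML", e);
        }
        NodeList descriptors = document.getElementsByTagName("descriptor");
        int[] result = new int[descriptors.getLength()];
        for (int i = 0; i < result.length; i++) {
            Element descriptor = (Element) descriptors.item(i);
            NodeList parent = descriptor.getElementsByTagName("parent_id");
            // No parent_id element means the event is independent
            result[i] = parent.getLength() == 0
                    ? ROOT_ID
                    : Integer.parseInt(parent.item(0).getTextContent().trim());
        }
        return result;
    }
}
```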

The example code is almost the same as for CsvReader, except for two lines:

 //... same code as for CsvReader here ...
 IDescriptorReader reader = new XmlReader("input.xml");
 DescriptorCollection coll = reader.read();
 //... same code as for CsvReader here ...

Here is the output:

Wed Apr 01 00:00:00 MSD 2009
Sat Feb 14 00:00:00 MSK 2009
Thu Nov 26 00:00:00 MSK 2009
Thu Dec 03 00:00:00 MSK 2009
Thu Nov 19 00:00:00 MSK 2009

JdbcReader

This implementation of the IDescriptorReader interface loads a DescriptorCollection from any JDBC source. Event descriptors should be stored in a separate database table. Here is the database diagram:

[dropbox image descriptorsTableSqlite.png]

And here is the DDL (the table name is assumed to be event_descriptors):

SQLite

CREATE TABLE "event_descriptors" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
"name" TEXT NOT NULL,
"description" TEXT,
"parent_id" INTEGER,
"expression" TEXT NOT NULL
)

PostgreSQL

CREATE TABLE event_descriptors
(
id serial NOT NULL,
"name" character varying NOT NULL,
description character varying NOT NULL,
parent_id integer,
expression character varying NOT NULL,
CONSTRAINT "FK_Events" PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);

MS SQL

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[event_descriptors](
[id] [int] IDENTITY(1,1) NOT NULL,
[name] [varchar](256) NOT NULL,
[description] [varchar](4096) NULL,
[parent_id] [int] NULL,
[expression] [varchar](256) NOT NULL
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO

MySQL

CREATE TABLE `test`.`event_descriptors` (
`id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(256) NOT NULL,
`description` VARCHAR(4096),
`parent_id` INTEGER UNSIGNED,
`expression` VARCHAR(256) NOT NULL,
PRIMARY KEY (`id`)
)
ENGINE = InnoDB;

Here is the code that shows how to use JdbcReader:

	// ===PostgreSQL===
	//
	// String url = "jdbc:postgresql://127.0.0.1:5432/db_name";
	// Properties props = new Properties();
	// props.setProperty("user", "login");
	// props.setProperty("password", "pass");
	// Connection conn = DriverManager.getConnection(url, props);

	// ===MySQL===
	//
	// Connection conn = DriverManager
	// .getConnection("jdbc:mysql://localhost/db_name?user=login&password=pass");

	// ===MSSQL===
	//
	// Connection conn = DriverManager
	// .getConnection("jdbc:sqlserver://localhost:1433;databaseName=db_name;"
	// + "user=login;password=pass;");

	// ===Sqlite===
	//
	Class.forName("org.sqlite.JDBC");
	Connection conn = DriverManager.getConnection("jdbc:sqlite:test_db");

	IDescriptorReader reader = new JdbcReader(conn, "table_name");
	DescriptorCollection coll = reader.read();

	System.out.println(coll.size());

	EventBuilder builder = new EventBuilder();
	List<IEvent> events = builder.buildEvents(coll);

	final int year = 2009;

	for (IEvent event : events) {
	    System.out.println(event.getDate(year));
	}

You can download the latest version of the library from here


6 Responses to “jHolidays project”

  1. Sven Says:

    Hi,

    there is an open source project to calculate holidays for a given country/year/state/region. Please see http://jollyday.sourceforge.net

    Hope this can help.

    Cheers, Sven

  2. tillias Says:

    Why not to contribute to existing project (it even supports plugins for *special* events) and reinvent the wheel?

  3. Sven Says:

    Hi Tillias,
    there is no reason to be upset and to flame on someones project reviews. I was creating my project because I saw the need to have an API which does not rely on any db or network resource.
    And to have two projects does not mean that they cannot cooperate. For example you could use the XML files of mine to fill up your DB.
    By the way there is already one project which started long before yours. This guy could have said the same about your project. Look at Ulrich Hilgers webservice which does pretty much what you are trying to accomplish.

    • tillias Says:

      Well, why flame? I’ve looked carefully (several hours) through your sources before posting. My project doesn’t rely on database or network resources… It can do it optionally though. It also uses XSD validation (please refer to wiki of project).

      Nothing personal, sorry. I’m opened to any well-founded critics.

      As for Ulrich Hilgers – I’ve googled much before creating… Sorry I haven’t found it.

  4. Sven Says:

    See Ulrich Hilgers webservice under http://www.daybase.eu/ .

    As far as I have looked through your code your API delivers some kind of infrastructure to handle events. It does not (currently?) deliver the data itself.
    Mine has not much code but is mainly made up of data files.
    This may give you the chance to use my data files to enrich your project. Thats one of the big advantages of open source.

    • tillias Says:

      Yes, you’re right its kinda infrastructure. Data is delivered by end user from different types of data sources. Sorry if I made you think about aggressiveness from me to your project. You can always edit my review or trash it though (or I can do it if you ask).

      About data files — it would be really hard to incorporate many types of cultures into one small open source project :( Personally I wanted to do it (base revision) but there are tons of Google Cals and iCals over internet. So I gave up and made infrastructure. Last month I’m working on “provider” which can reuse iCals in order to use them as data sources… Maybe some day I’ll have the time to do provider for Google Cals

