Storing files and metadata with NHibernate

Posted by Siim on September 21st, 2010

Now and then there comes a need to save some documents to the database also, along with the other information stored there already. It’s not a big deal to save some additional files to the database, especially when it’s not the main purpose of the application, just some extra. So no need for different storage mechanism.

This time I’m using NHibernate and I thought how I should structure tables and write mappings so that files data gets loaded only then when I really need it (eg. downloading a file). I also save some metadata about the file, some if it has business meaning, some of it doesn’t.

Database

I came up with the following simple schema:

clip_image001

I can load all the related meta-info about the file easily without touching the real content. That kind of schema also allows us to put FILE_DATA table to a different file group in MS SQL Server, so physically it can be located on other drive or partition whereas files metadata is located with our other tables.

Mappings

But this is only database side, there are also NHibernate mappings we have to work on. From NHibernate point of view, it’s not one-to-one relation but more like an one-to-many relation, meaning that File may have 0 or more FileData objects. So internally I mapped it like so, but to the outside I exposed it like one-to-one. I am using FluentNhibernate and here are my mappings for that:

public class FileMap : ClassMap<File>
{
	public FileMap()
	{
		DiscriminateSubClassesOnColumn("FILE_TYPE");

		Id(x => x.Id);
		Map(x => x.Type, "FILE_TYPE")
			.ReadOnly();
		Map(x => x.Name);
		Map(x => x.ContentType);

		HasMany<FileData>(Reveal.Member<File>("_files"))
			.Access.Field()
			.Cascade.AllDeleteOrphan()
			.Inverse();
	}
}

public class FileDataMap : ClassMap<FileData>
{
	public FileDataMap()
	{
		Id(x => x.Id);
		Map(x => x.Content);
		References(x => x.File);
	}
}

And related objects:

public abstract class File : IntegerIdentityEntity
{
	private IList<FileData> _files;

	protected File()
	{
		_files = new List<FileData>();
	}

	protected virtual IList<FileData> Files
	{
		get { return _files; }
	}

	protected virtual FileData FileData
	{
		get { return Files.SingleOrDefault(); }
	}

	public virtual string Type { get; protected set; }

	public virtual string Name { get; set; }

	public virtual string Number { get; set; }

	public virtual string ContentType { get; set; }

	public virtual byte[] Data
	{
		get
		{
			return FileData != null ? FileData.Content : null;
		}
		set
		{
			if (FileData != null && value == null)
			{
				FileData.File = null;
				Files.Remove(FileData);
			}
			else if(FileData != null)
			{
				FileData.Content = value;
			}
			else if(value != null)
			{
				var dataFile = new FileData {File = this, Content = value};
				Files.Add(dataFile);
			}
		}
	}
}

public class FileData : IntegerIdentityEntity,
{
	public virtual File File { get; set; }

	public virtual byte[] Content { get; set; }
}

As you can see, externally there is only one object – File, with property Data which returns the contents of the file as a byte array. Internally it is stored in the FileData collection because it’s the natural way how NHibernate handles such scenarios but that doesn’t mean we must always expose it as collection.

Conclusion

This approach allows to store files and their metadata in a database in a simple way, without adding any extra overload to fetching when it’s not necessary. From a database perspective it don’t add any extra constraints either, because we can put data table to a different partition so keeping the size of the main database reasonable.

Querying localized values in NHibernate

Posted by Siim on March 30th, 2010

In my previous post I talked about how to create model for localized values. Now it’s time to show how I’m querying against that model.

I display language selection to the user. So all localizable data is preferably shown in the language the user chose. When displaying persons list, for example, person name is shown in the language the user chose or the first available localized value if the localization for chosen language was not found. I’ll use HQL queries for that, which return a projection from entity to DTO to contain only data I need. HQL allows better (and easier) optimization than using criteria and it’s easier to write.

So HQL that returns persons list where first name and last name are localized, looks like this:

select distinct new PersonDataContract(person.Id, firstName.Value, lastName.Value, person.ContactCard.Phone, person.ContactCard.Mobile,
	person.ContactCard.Email.Address, l.id, firstName.Language.id, lastName.Language.id)
from Language l, Person as person
	join person.FirstName tfname
	left outer join tfname.Localizations firstName
	join person.LastName tlname
	left outer join tlname.Localizations lastName
where l.id = :langId
	and (firstName.Language.id = l.id or (index(firstName) = 0 and l.id not in (select loc.Language.id from Localization loc where loc.Translation = tfname)))
	and (lastName.Language.id = l.id or (index(lastName) = 0 and l.id not in (select loc.Language.id from Localization loc where loc.Translation = tlname)))
order by lastName.Value, firstName.Value 

I also fetch the language id for localized values because I want to show to the user if returned value was in language he requested.

And matching SQL from NHProf:

persons_query

When an object has many localizable properties then such a query could become too complex, too many joins. In that case it may be better to use different strategy for fetching objects. We can use NHibernate’s multi queries or Futures. I’ll show you similar example, but in a bit different context.

In some situations it would be better to show to user all translations for selected language in a single list. For example when there is a language specialist to translate all values from one language to another. In that case we need to load all translations (for specified context, eg. person names) with localizations. In this case I need also to use paging to limit the result set.

I use multi queries approach here. First I fetch translations for the selected context and languages list. And for each language I load localizations for translations in separate queries. So I make total of two roundtrips to the database.

 public IEnumerable<Translation> GetTranslationsBy(Type translationType, int startItemIndex, int numberOfItems)
{
	// Select IDs first
	var idQuery =
		Session.CreateQuery("select t.id from Translation t where t.class = " + translationType.FullName)
			.SetFirstResult(startItemIndex)
			.SetMaxResults(numberOfItems);
	var temporaryList = Session.CreateMultiQuery()
		.Add(idQuery)
		.Add("from Language").List();

	const string queryFormat = "from Translation t left join fetch t.Localizations l where t.class = {0} and l.Language.id = :{1} and t.id in (:translationIds)";

	var multiQuery = Session.CreateMultiQuery();
	foreach (Language language in (ArrayList)temporaryList[1])
	{
		var langParamName = string.Format("lang{0}Id", language.Id);
		var query = Session.CreateQuery(string.Format(queryFormat, translationType, langParamName))
			.SetInt32(langParamName, language.Id);
		multiQuery.Add(query);
	}
	multiQuery.SetParameterList("translationIds", (ArrayList) temporaryList[0]);

	IEnumerable<Translation> result = new List<Translation>();
	var list = multiQuery.List();
	if (list.Count > 0)
	{
		result = (IEnumerable<Translation>)((ArrayList)list[0]).ToArray(translationType);
	}
	return result;
}

repository.GetTranslationsBy(typeof (PartnerNameTranslation), 0, 10); 

With results:

translations_queries

translations_query1

translations_query2

The third query is similar to the first two, only for different language.

I’m using only the first result set from the second batch. Other queries are only for fetching all available localizations for translations and actual results they return, will be discarded. That way I pre-load all the values and therefore there is no need for lazy loading them which in turn would result in executing multiple queries to the database.

As I mentioned, the last approach may also be used when loading DTO’s, only in that case mapping to DTO should be done by the user (or use AutoMapper in some extent).

Mapping translations in NHibernate

Posted by Siim on February 24th, 2010

Some time ago I wrote about translations. Now was the time to actually implement that feature. Because I don’t have any default language which is always present on an object, I had somehow create it virtually, so in case of translation is not found in current language, the default one (for a given object) will be used. I decided that the first language for an object (that is, the language in which the object was created), is used as a default one.

So my data model looks like this. Firstname_id and lastname_id both refer to the translation table as foreign key relations.

data_model_loc

All the objects were mapped as entities and I was trying to map relation between Translation and it’s Localizations as an ordered list. But I soon discovered that ordered collections don’t support bi-directional associations natively. I had a bi-directional relation between Translation and Localizations. So I had to do index handling myself. For that I created a Index property to Localization which is mapped to order_index column. Property looks like this:

public virtual int Index
{
    get
    {
        return Translation.IndexOf(this);
    }
    private set
    { }
}

And Translation object looks like this:

public abstract class Translation
{
    private IList<Localization> _localizations;
    public Translation()
    {
        _localizations = new List<Localization>();
    }
    public abstract string Context { get; }
    public virtual IEnumerable<Localization> Localizations
    {
        get { return _localizations; }
    }
    public virtual int IndexOf(Localization localization)
    {
        return _localizations.IndexOf(localization);
    }
    public virtual string DefaultValue
    {
        get
        {
            var loc = _localizations.FirstOrDefault();
            return loc != null ? loc.Value : null;
        }
    }
    public virtual string GetCurrentValue(Language language)
    {
        return this[language] ?? DefaultValue;
    }
    public virtual string this[Language language]
    {
        get
        {
            var localization = _localizations.SingleOrDefault(x => x.Language.Locale == language.Locale);
            return localization == null ? null : localization.Value;
        }
        set
        {
            var localization = _localizations.SingleOrDefault(x => x.Language.Equals(language));
            if (localization != null && !string.IsNullOrEmpty(value))
            {
                localization.Value = value;
            }
            else if (localization != null)
            {
                RemoveLocalization(localization);
            }
            else if (!string.IsNullOrEmpty(value))
            {
                AddLocalization(language, value);
            }
        }
    }
    private void RemoveLocalization(Localization localization)
    {
        _localizations.Remove(localization);
    }
    private void AddLocalization(Language language, string value)
    {
        var localization = new Localization(language, value) { Translation = this };
        _localizations.Add(localization);
    }
}

As you can see, I added indexer property to Translation to manipulate with localizations conveniently. Although I’m not sure if using an object as an indexer has any drawbacks later on… Thoughts welcome.

You may have also noticed abstract Context property on Translation. It isn’t strictly required, but I found it useful in my implementation. By using this I can conveniently ask all the translations for person name or for product name, for example. This is useful when users need to translate all product names in a batch, so I can display them on a single form.

And here are example NHibernate mappings:

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
<!-- Translation -->
<class abstract="true" name="Translation" table="[translation]">
    <id name="Id" access="property" column="translation_id">
        <generator class="identity" />
    </id>
    <discriminator column="context"/>
    <list name="Localizations" cascade="all-delete-orphan" fetch="join" access="field.camelcase-underscore" inverse="true">
        <key column="translation_id" />
        <index column="order_index" />
        <one-to-many class="Localization" />
    </list>
    <subclass discriminator-value="PersonFirstName" name="PersonFirstNameTranslation">
    </subclass>
    <subclass discriminator-value="ProductName" name="ProductNameTranslation">
    </subclass>
    <!-- etc -->
</class>
<!-- Localization -->
<class name="Localization" table="localization">
    <id name="Id" access="property" column="localization_id">
        <generator class="identity" />
    </id>
    <many-to-one name="Translation" class="Translation" column="translation_id" not-null="true" />
    <many-to-one name="Language" class="Language" column="language_id" not-null="true" />
    <property name="Value" column="value"/>
    <property name="Index" column="order_index" type="int" update="true" insert="true" />
</class>
<!-- Person -->
<joined-subclass name="Person" extends="Party" table="person">
    <key column="party_id" />
    <property name="PersonalCode" column="personal_code" />
    <property name="DateOfBirth" column="birth_date" access="field.camelcase-underscore" />
    <many-to-one name="FirstName" column="firstname_id" class="Translation" fetch="join"
        access="property" cascade="all-delete-orphan" not-null="true"/>
    <many-to-one name="LastName" column="lastname_id" class="Translation" fetch="join"
        access="property" cascade="all-delete-orphan" not-null="true"/>
</joined-subclass>
</hibernate-mapping>

And storing a new object with translations is simple as this:

var person = new Person();
person.PersonalCode = "12345670";
// languages is IList<Language>
foreach (var language in languages)
{
    person.FirstName[language] = "First name: " + language.Name;
    person.LastName[language] = "Last name: " + language.Name;
}
session.Save(person);

This solution has it’s own drawbacks also, but I found that it’s best for my needs. Currently translations are not reusable. By that I mean when one term is used in multiple places, it has to be maintained separately on each instance. Of course, I can make user to choose from existing translations, but it seems to me that this makes things overly complicated. But again, it depends on the exact context :)

Repository and Query Object

Posted by Siim on November 19th, 2009

There has been some discussion over the topic already and there are different opinions. I have been using Repository model all the way so far. I have implemented generic base repository for all common operations and when there are some specific needs (view specific query for example!) for some types then I’ll create specific repository. I know that repositories should be created only for aggregate roots but there I have encountered a little problem. View needs to display data from different entities (aggregates or not) so I cannot always follow the principle of repository per aggregate root. So this poses a problem here – repositories are cluttered with view specific needs. And moreover – repository methods are decorated with sorting and paging parameters which definitely shouldn’t be there. So I decided to change that.

There have been some discussion about using multiple models of the domain for different purposes. For example different kind of a model for reporting purposes. It can even be different kind of data store for that model. Udi and Ayende (didn’t find exact posts anymore) have blogged about that before. Complex views with special needs can be also considered as a way of reporting. In my case, using wholly different model for a view is kind of a overkill so I try to leverage the problem by using query objects separated from the repository. They don’t use repositories by themselves so query objects can be viewed as "specific repositories for views" and they live only in presentation layer. And they don’t operate with domain objects, only with view specific objects like DTOs, filtering specifications. By using this kind of approach, writing complex view specific queries should be pretty easy. I’m not tied to NHibernate criteria, I can use HQL or even ADO.NET. Currently I’m using HQL which returns directly DTOs but I’m thinking of giving a try to LINQ for NHibernate.

When using repositories, I used Expressions to represent sorting and filtering specification. But this seems now overly complex when considering that there were some heavy mappings from string format to Expression format. So now, when it’s only a presentation concern, I can use plain strings, just like they come from the view and do the proper mappings in the query object itself. Like they say – KISS.

Multilingual content in DDD using NHibernate

Posted by Siim on October 29th, 2009

There are not so many business applications which need multilingual content. I exclude any kind of CMS’s, they aren’t “true” business application anyway, in my opinion. But the current project I’m working on has that kind of requirement – most of the data must be available in multiple languages and users should easily and intuitively browse and edit that data in current working language. Part of this requirement is not so easy to achieve. Especially when there is a constraint that there cannot be a default language in it’s standard term – object may be available in any language and even only in that language.

Currently we are using a single translations table where translation is identified by it’s context which is mostly some sort of object/field key or basically can be any string. All the translatable fields are stored in the entity table also as a “original value”. This solution is far from perfect and do not fit very well into domain model in general. It was meant to be a quick draft of the solution.

Usually, this kind of translation should be invisible in domain model perspective. But in my case I think it should be reflected in domain model also, because it IS one of the purpose and requirement of the business application. There are two possible viewpoints to consider – a technical perspective and a user’s point of view. I’ve explored different models that people have used before and I think that the solution that Davy Brion describes seems the best one.

It is tied to the domain object. That means there are direct relation between entity and it’s translations. It simplifies CRUD operations on it, NHibernate can take care of it itself. Searching from localized versions are also quite simple, performance overhead is minimal.
It uses a single translations table. Actually there are couple of tables, but all translations are stored and accessed from one place. This makes later maintenance simpler and don’t create so much noise in the domain model (and in data model also).
It’s quite simple to extend and it fits perfectly into NHibernate. I find that for my solution I should also apply the default language attribute on the translation, because for different object, the term default language can mean different language from what is global default languages, if such exists.

When using this model, every translatable property (a product’s name for example) is an object which holds at least one localization. We don’t have a term original like I have seen in some models, every value is a localization. This makes things simpler in users point of view.

I don’t write about any technical data, you can check that on Davy’s blog, but I will update this blog post when I do any changes to this model. And I’ll try to provide some samples then.


Copyright © 2007 Siim Viikman's blog.