Installing Alfresco Community 4 On Ubuntu Server 12

In this article, I’m going to show you how I installed Alfresco Community 4 on a plain Ubuntu 12.04 server. We’ll be installing this entirely from the command line, without the assistance (complication?) of a GUI. I’m assuming you have a server ready; if not, set one up first using the starter Ubuntu server guide.

Next, install Java. Now, this part is a little difficult, since Oracle doesn’t have a means of easily downloading the necessary tar file without jumping through GUI hoops. The way I did it was to use a GUI-based workstation to surf over to the Oracle downloads page, select the JDK, give away various legal rights, download it, then put it somewhere I could wget it from onto the pristine Ubuntu server above. Once you have your grubby paws on it, uncompress it and move the resultant folder into /usr/lib/jvm. Now, if you installed from a virginal, minimal Ubuntu server, as we got from the previous guide, you don’t need to do much else. Other tutorials may have you purging OpenJDK before you take the steps below, or setting environment variables afterwards. That should not be necessary.

$ wget http://shoved.it.here/jdk-7u5-linux-x64.tar.gz
$ tar -xvzf jdk-7u5-linux-x64.tar.gz
$ sudo mkdir /usr/lib/jvm
$ sudo mv jdk1.7.0_05 /usr/lib/jvm/jdk1.7.0
$ sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/jdk1.7.0/bin/javac 1
update-alternatives: using /usr/lib/jvm/jdk1.7.0/bin/javac to provide /usr/bin/javac (javac) in auto mode.
$ sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.7.0/bin/java" 1
update-alternatives: using /usr/lib/jvm/jdk1.7.0/bin/java to provide /usr/bin/java (java) in auto mode.
$ java -version
java version "1.7.0_05"
Java(TM) SE Runtime Environment (build 1.7.0_05-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode)

If installation went as planned, you should see the version output above when you run java -version. Next up is some accessory software. First, for our database, we will use the excellent PostgreSQL. We will also install ImageMagick (for image manipulation), FFmpeg (for transforming video), LibreOffice (for the embedded document engine) and SWFTools (for the pdf2swf utility and previewing PDF files). Installing the last two is a bit tricky, as we have to set up a couple of PPAs to get this to work, which itself requires installing the ability to add PPAs (thus requiring python-software-properties)!

$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:guilhem-fr/swftools
$ sudo add-apt-repository ppa:libreoffice/ppa
$ sudo apt-get update
$ sudo apt-get install postgresql imagemagick ffmpeg swftools libreoffice

Alfresco is a Java-based web application, and needs a Java servlet container to run in; the standard choice is Tomcat. So let’s install and configure that. While you are at it, install the Tomcat native libraries (libtcnative) for a little oomph.

$ sudo apt-get install tomcat7
$ sudo service tomcat7 stop
$ sudo apt-get install libtcnative-1
$ sudo service tomcat7 start
 * Starting Tomcat servlet engine tomcat7                    [ OK ]

You can test your Tomcat server by pointing a browser to http://your.server.here:8080/ and you should see the standard Tomcat “It Works!” greeting page.
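
If you are working over SSH without a browser handy, a quick check from the server itself works too (assuming curl is installed; wget will do the same job):

$ curl -I http://localhost:8080/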

We are finally ready to install Alfresco. Download the zip file on your budding server, unzip it into a directory (we creatively called ours “alfresco”), and do the following:

$ unzip alfresco.zip -d alfresco
$ sudo cp -r ~/alfresco/web-server/shared /var/lib/tomcat7
$ sudo cp -r ~/alfresco/web-server/webapps /var/lib/tomcat7
$ sudo cp -r ~/alfresco/web-server/lib /var/lib/tomcat7/shared/lib
$ sudo cp -r ~/alfresco/bin /var/lib/tomcat7/bin
$ sudo cp -r ~/alfresco/licenses /var/lib/tomcat7/licenses
$ sudo cp -r ~/alfresco/README.txt /var/lib/tomcat7/README.txt
$ sudo mv /var/lib/tomcat7/shared/classes/alfresco-global.properties.sample /var/lib/tomcat7/shared/classes/alfresco-global.properties
$ sudo mv /var/lib/tomcat7/shared/classes/alfresco/web-extension/share-config-custom.xml.sample /var/lib/tomcat7/shared/classes/alfresco/web-extension/share-config-custom.xml

Now, we create a PostgreSQL database for Alfresco to use. If the last command (the psql connection test at the end) runs without error, you’re doing fine.

$ sudo mkdir /opt/alfresco
$ sudo chown -R tomcat7:tomcat7 /var/lib/tomcat7 /opt/alfresco
$ sudo -u postgres createuser
Enter name of role to add: alfresco
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n
$ sudo -u postgres createdb alfresco
$ sudo -u postgres psql
psql (9.1.4)
Type "help" for help.
postgres=# alter user alfresco with encrypted password '!QAZ1qaz';
ALTER ROLE
postgres=# grant all privileges on database alfresco to alfresco;
GRANT
postgres=# \q
$ psql -h localhost alfresco alfresco

You’ll now have to get your hands dirty and delve into various settings and configuration files. Here’s the list:

  • /var/lib/tomcat7/conf/catalina.properties
    • shared.loader=${catalina.home}/shared/classes,${catalina.home}/shared/*.jar,/var/lib/tomcat7/shared/classes,/var/lib/tomcat7/shared/lib/*.jar
  • /etc/default/tomcat7
    • JAVA_HOME=/usr/lib/jvm/jdk1.7.0
    • JAVA_OPTS="-Djava.awt.headless=true -Xmx768m -XX:+UseConcMarkSweepGC"
    • JAVA_OPTS="${JAVA_OPTS} -XX:MaxPermSize=512m -Xms128m -Dalfresco.home=/opt/alfresco -Dcom.sun.management.jmxremote"
    • JAVA_OPTS="${JAVA_OPTS} -XX:+CMSIncrementalMode"
  • /var/lib/tomcat7/shared/classes/alfresco-global.properties
    • Change all settings to match your setup (a sample of the key settings follows this list).
  • /etc/postgresql/9.1/main/pg_hba.conf
    • Edit this only if you need to allow different access (for example, connections from other hosts); check the PostgreSQL documentation for details.
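
For reference, here is roughly what the database and content-store settings in alfresco-global.properties should look like for the setup above. This is only a sketch of the usual keys; adjust the paths, port and password to your own choices:

dir.root=/opt/alfresco/alf_data
db.driver=org.postgresql.Driver
db.username=alfresco
db.password=!QAZ1qaz
db.name=alfresco
db.url=jdbc:postgresql://localhost:5432/alfresco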

Now, from any browser, you can log in via http://your.server.here:8080/alfresco or http://your.server.here:8080/share and begin managing your content in an enterprisey way.

An Nginx Configuration For Jenkins

Lots of people have posted Nginx configurations for effectively proxying Jenkins when both are on the same server, but for some reason, having them on different servers doesn’t seem to be as commonly discussed. I am using Nginx in my SOHO network to front a few virtual servers, and provide them all via the few IPs I have on my Comcast Business Class connection. That means having a proxy that can serve up the various systems supporting various domains.

We’ve covered how to build a Jenkins server, so for the sake of documenting this additional capability, here’s my configuration:

server {
  listen 80;
  server_name jenkins.domain.com;

  access_log /var/log/nginx/jenkins_access.log main buffer=32k;
  error_log /var/log/nginx/jenkins_error.log;

  # Strip any /jenkins prefix from the incoming URI; proxy_pass below
  # re-adds the Jenkins context path on the backend server.
  rewrite /jenkins/(.*) /$1 last;

  location / {
    proxy_pass       http://192.168.1.115:8080/jenkins/;
    proxy_redirect   off;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    # max upload size
    client_max_body_size       20m;
    client_body_buffer_size    128k;
    proxy_connect_timeout      90;
    proxy_send_timeout         90;
    proxy_read_timeout         90;
    proxy_buffer_size          4k;
    proxy_buffers              4 32k;
    proxy_busy_buffers_size    64k;
    proxy_temp_file_write_size 64k;
  }

}
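
After dropping this server block into your Nginx configuration (for example, a file under /etc/nginx/sites-available/ with a symlink in sites-enabled/, if you use that layout), test the configuration and reload:

$ sudo nginx -t
$ sudo service nginx reload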

A Starter Ubuntu 12.10 Server

Here are my usual steps:

  1. Install Ubuntu 12.10 Server (x64) on the virtual or physical machine. When stepping through the install, make sure to install the OpenSSH server if you will want or need to remote in to work with the system.
  2. Once done, log in and bring the server up-to-date:

    $ sudo apt-get update
    $ sudo apt-get upgrade
    $ sudo apt-get dist-upgrade
  3. I like to install etckeeper to help track config changes, so do a:

    $ sudo apt-get install git etckeeper chkconfig

    Correct the default configuration in /etc/etckeeper/etckeeper.conf to work with git:

    $ sudo nano /etc/etckeeper/etckeeper.conf

    Uncomment the git line, and comment out the bzr line. Save, then fire it up:

    $ sudo etckeeper init
    $ sudo etckeeper commit "Baseline"

    Now etckeeper has set up a cronjob that will run daily and auto-commit any changes to files in or under the /etc directory.
  4. For a server, you’ll likely want to give it a static IP, instead of the default DHCP that installation sets up. So, we’ll edit /etc/network/interfaces. We’re going to use an IP of 192.168.1.112 as an example, but you should change it to whatever makes sense for your network.

    $ sudo nano /etc/network/interfaces

    Assuming your main NIC ended up as eth0, change:
    iface eth0 inet dhcp

    To:

    iface eth0 inet static
    address 192.168.1.112
    netmask 255.255.255.0
    network 192.168.1.0
    broadcast 192.168.1.255
    gateway 192.168.1.1
    dns-nameservers 192.168.1.100 208.67.222.222
    dns-search domain.local

    Of course, change the settings to whatever makes sense for your LAN, and make sure you list the right DNS servers. You will find, as I did, that the way nameservers are handled changed surprisingly in 12.10: with a static IP, where you disable DHCP lookups, your resolv.conf file will be blank at every boot. The easiest option is to add the “dns-” prefixed lines to your interface configuration, as shown above.

  5. Finally, restart your networking:

    $ sudo /etc/init.d/networking restart

Enjoy your clean starter Ubuntu server!

Continuous Integration With Jenkins On Ubuntu 11.10

First, install Ubuntu Server 11.10. Obviously, settings will vary from machine to machine, but when you get to the page for selecting software to be installed, make sure you select both the OpenSSH server and the Tomcat server.

Ubuntu Software Selections

With a fresh server install, you’ll want to assign a static IP to your server. Ubuntu Server 11.10 will likely detect your network card, and set it up during install to use DHCP. But, it makes more sense for a server to have a stable IP. You can change this in /etc/network/interfaces. Change the section that likely reads as:

iface eth0 inet dhcp

to something like:

iface eth0 inet static
  address 192.168.x.x
  netmask 255.255.255.0
  gateway 192.168.x.1

Of course, use whatever local LAN network addresses make sense for you. Either restart the network service (sudo /etc/init.d/networking restart) or reboot.

When you’ve rebooted, make sure to update Ubuntu itself.

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo reboot

Jenkins is a Java app that needs some environment to run in. We’ve already installed Tomcat for this through the Ubuntu installer. You can verify it is running by surfing to http://[your IP address]:8080. You may also want to set up access to the manager app at http://[your IP address]:8080/manager/html: surfing to that page and failing the login prompt will show you the information needed to configure a manager user on your new Tomcat server. The other reason to do this is that the manager page lets you easily deploy the Jenkins WAR. Download the Jenkins WAR and upload it via the Tomcat manager app.
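
To actually get into the manager app, you need a Tomcat user with the manager role. Assuming the stock Ubuntu Tomcat 6 layout, that means adding entries like these inside the <tomcat-users> element of /etc/tomcat6/tomcat-users.xml (pick your own username and password), then restarting Tomcat:

  <role rolename="manager"/>
  <user username="admin" password="choose-a-password" roles="manager"/>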

If you now surf over to http://[your IP address]:8080/jenkins, you will see Jenkins, but in an error state. It will complain that it is “Unable to create the home directory ‘/usr/share/tomcat6/.jenkins’. This is most likely a permission problem.” Well, at least Jenkins is running! The easy way to solve this is to give Tomcat access to that folder.

$ cd /usr/share/tomcat6
$ sudo mkdir .jenkins
$ sudo chown tomcat6:nogroup .jenkins
$ sudo /etc/init.d/tomcat6 restart

That should get you going on your adventure in continuous integration with Jenkins.

Fedora 16 Minimal Install And No Networking

Hopefully, this helps some people quickly shortcut to a solution instead of putzing around for a couple hours wondering “Why, O Why?”. If you install Fedora 16 with a minimal install, like I recently did, you will find out that while you can maybe ping your local systems, you cannot get out on the net.

Apparently, “minimal installation” to Fedora seems to really mean barest of bones. This is actually an old annoyance. The key issue seems to be that the network configuration script is either missing or badly formatted. In my case, Anaconda asked me for some eth0 settings, and I supplied a static IP on the local 192.168.x.x LAN, which worked fine to pull necessary packages during install. However, the resultant ifcfg-eth0 was incorrect:

[root@registeel ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
HWADDR="6A:27:C0:31:4A:B0"
DOMAIN="mcs.local"
IPV6INIT="no"
UUID="5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03"
IPADDR0="192.168.1.115"
DNS1="192.168.1.100"
PREFIX0="24"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="yes"
NM_CONTROLLED="yes"
BOOTPROTO="none"
GATEWAY0="192.168.1.1"
DEVICE="eth0"
TYPE="Ethernet"
ONBOOT="yes"
NAME="eth0"

The issue is the trailing zeros on two of the keys, IPADDR0 and GATEWAY0. Just remove the trailing zeros (see the corrected lines below), then restart the network service or reboot, and all should be well.
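
With the example values above, those two lines should end up reading:

IPADDR="192.168.1.115"
GATEWAY="192.168.1.1"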

Enjoy!

Concatenate Aggregate Function For SQL Server

Tired of the fact that, after all these years, there still isn’t a convenient, built-in way to concatenate strings in an aggregating query? Well, wait no more!

First, check if your database environment is set up to allow CLR user-defined functions:

SELECT * FROM sys.configurations WHERE name = 'clr enabled'

If value_in_use = 1, you are set up for it. If not, you can turn it on yourself via:

sp_configure 'clr enabled', 1;
GO
RECONFIGURE;
GO

Now, fire up VS2010. Create a Visual C# SQL CLR Database Project from the SQL Server database templates. Right-click on the project in the solution and click on “Add New Item”. From the list, select “Aggregate” (and note all the other goodies you could create). I called the solution/project `ConcatMaxNullable`.

There is plenty of good information on what the various method stubs are doing. There are also some hard-to-find examples of concatenation aggregates out there. Unfortunately, none of them fit my needs. I needed one that:

  • Concatenated strings. (Duh!)
  • Could output to the nvarchar(max) type and not be limited to the 4,000 characters of plain old nvarchar(n).
  • Allowed skipping nulls, or replacing them with empty strings that are still delimited.

So, I rolled my own. I’m going to go ahead and simply post my own code for others to use. Should be self-explanatory as the methods don’t really do anything particularly novel.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text;

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
  Format.UserDefined,
  IsInvariantToOrder = false,
  IsInvariantToNulls = true,
  IsInvariantToDuplicates = false,
  MaxByteSize = -1
  )]
public struct ConcatMaxNullable : IBinarySerialize
{
  private StringBuilder _accumulator;
  private string _delimiter;

  public Boolean IsNull { get; private set; }

  public void Init()
  {
    _accumulator = new StringBuilder();
    _delimiter = string.Empty;
    this.IsNull = true;
  }

  public void Accumulate(SqlChars item, SqlString delimiter, SqlBoolean skipNulls)
  {
    // if value is null, return if skipNulls is true
    if (item.IsNull && skipNulls.IsTrue)
      return;

    // if we have an actual delimiter
    if ((!delimiter.IsNull) && (delimiter.Value.Length > 0))
    {
      _delimiter = delimiter.Value;
      // accumulate delimiter if we have something already
      if (_accumulator.Length > 0)
        _accumulator.Append(delimiter.Value);
    }

    // if value is null, just add delimiter (above) and return
    if (item.IsNull)
      return;
    else
    {
      _accumulator.Append(item.Value);
      this.IsNull = false;
    }

  }

  public void Merge(ConcatMaxNullable Group)
  {

    if (_accumulator.Length > 0 && Group._accumulator.Length > 0)
      _accumulator.Append(_delimiter);

    _accumulator.Append(Group._accumulator.ToString());

  }

  public SqlChars Terminate()
  {

    return new SqlChars(_accumulator.ToString());

  }

  void IBinarySerialize.Read(System.IO.BinaryReader r)
  {

    _delimiter = r.ReadString();
    _accumulator = new StringBuilder(r.ReadString());

    if (_accumulator.Length != 0)
      this.IsNull = false;

  }

  void IBinarySerialize.Write(System.IO.BinaryWriter w)
  {

    w.Write(_delimiter);
    w.Write(_accumulator.ToString());

  }

}

Once you’ve pasted this in, compile it and get the DLL somewhere the server can see it. Installing it is as easy as running these two little scripts:

CREATE ASSEMBLY ConcatMaxNullable FROM 'C:\ConcatMaxNullable.dll'
WITH PERMISSION_SET=SAFE
GO

CREATE AGGREGATE [dbo].[ConcatMaxNullable]
(@item [nvarchar](max), @delimiter [nvarchar](8), @skipNulls bit)
RETURNS [nvarchar](max)
EXTERNAL NAME [ConcatMaxNullable].[ConcatMaxNullable]
GO

Then, you can easily do something like this query, which shows you your foreign keys, and the columns involved, in a comma-separated format ready for scripting.

SELECT
  c.[TABLE_NAME], k.[CONSTRAINT_NAME], dbo.ConcatMaxNullable([COLUMN_NAME],',',1) AS [Cols] 
FROM
  INFORMATION_SCHEMA.TABLE_CONSTRAINTS c
INNER JOIN
  INFORMATION_SCHEMA.KEY_COLUMN_USAGE k ON c.[CONSTRAINT_NAME] = k.[CONSTRAINT_NAME] 
WHERE
  c.[CONSTRAINT_TYPE] = 'FOREIGN KEY'
GROUP BY
  c.[TABLE_NAME], k.[CONSTRAINT_NAME]
ORDER BY
  c.[TABLE_NAME], k.[CONSTRAINT_NAME] 
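
As a quicker smoke test, you can also aggregate across a whole table with no GROUP BY at all; for example, gluing together the table names in the current database:

SELECT dbo.ConcatMaxNullable([name], N', ', 1) AS [AllTables]
FROM sys.tables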

Enjoy!

Architecting A C# Web-Based Application: General Concepts

So, we’ve loosely tossed around our greenfield web application project in our head, and we’ve decided that we’re going to go ahead and develop it. The question then becomes “What’s next?”

At this point, the temptation is to just jump in and start coding, especially if the project is personal (not derived from one or more external stakeholders) and scratches a big, immediate itch. In my case, I have been trying to find a good PM solution that doesn’t burden me with umpteen keypresses to log a ticket and has a real ability to move items through a workflow. Many do not fit the bill, and my workload isn’t getting any leaner.

But, instead of jumping right in, we’ll pause for just a little while and gather up some key concepts we want to put in play in this project. There are a lot of choices to make when you develop a web application. There are concerns over technologies (ex: databases, both relational and key-value stores; code repositories; etc.), libraries (ex: ORMs, serializers, etc.), platforms (ex: ASP.NET MVC 2), frameworks (ex: jQuery, Sharp Architecture, etc.), development methodologies (ex: TDD, BDD), management approaches (ex: Scrum, XP), and architecture concepts and patterns (ex: CQRS, DDD), amongst many others. Each one of these items, like CQRS, is often the subject of multiple blog posts on its own, but I will attempt to cover the salient points with respect to this project within a couple of posts.

In this project, we will cover:

  • Domain-Driven Design (DDD): Domain-driven design has been around for a decade or two, but has really taken off in the last few years, as more developers start tackling software of growing complexity. DDD is a methodology (as in “a framework that is used to structure, plan, and control the process of developing an information system”) that addresses the broad topic of researching, understanding and then designing the conceptual part of whole systems. DDD makes the developer focus on the core functionality by isolating it into a domain model, separate from other concerns, and also by bringing the developer closer to the business user’s language. It does this through the core concepts of “ubiquitous language” and “bounded contexts”, which we will discuss in a later article. The goal of DDD is to create a set of techniques, concepts, patterns and language that directs you to focus on the domain, on the concepts of the system, rather than on the underlying technologies. Said differently, if you want to produce good software, it is more critical that you understand what an Order is and does logically, rather than decide which NoSQL store du jour to use. The latter is often more exciting to technologists, and is also a case of putting the cart before the horse. DDD will help not only build better software through better focus, but also help us do simple things like give us direction in project structure.
  • Command Query Responsibility Segregation (CQRS): This is actually a simple architectural concept or pattern, rather than a comprehensive, technology-specific framework or methodology. The goal is implied in the name itself. The idea is that commands/actions that tell a system what to do and that change the state of the system are segregated, at least logically, from the act of querying the state of the system. Reads and writes frequently contend in busy systems. CQRS is an approach to mitigate that reality of system design. Again, we’ll explore this in its own article soon.
  • Event-Driven Architecture (EDA): The ever-present question in any developer’s mind at work is “Where should this block of code go? How should I organize my code?” We want the end-product of our craft to be easy to write, easy to read and easy to change. These needs are encapsulated by a lot of acronyms: DRY, GRASP, SOLID, etc. In very general terms, the main goal is low coupling (low dependencies between functional sections of code, classes or otherwise) and high cohesion (breaking down code into functional sections that are very focused in purpose). Of course, as you break down your code into focused chunks that are independent of each other, the question becomes: how do you get them to work together? In comes messaging. These blocks of code coordinate and affect each other via messages. One block of code raises an event (a message) that it did its thing and changed the system, and then interested listeners pick up the message and do what they were created to do in response.
  • Dependency Injection (DI): When we create blocks of independent code, oftentimes there is the need (or temptation) to have one block use another block directly. And so, we pull in a direct reference or link to that code; we new up what we need. In so doing, we have increased the amount of coupling in the system. DI is a way to reduce this coupling. For example, if we implement an EDA-based system, almost every block of code needs to publish its event messages into a system that can then distribute them out to interested listeners. We don’t want to have that code in multiple places; that breaks DRY. We also don’t want to put that code in its own block, and then have every other block link to it; that ruins low coupling. Instead, we use the DI pattern. This allows us to register the event system, and its various parts, in a container or directory that any other part of the system can see and use. That code doesn’t get repeated, and the indirect nature of the link allows looser coupling. So, when one block of code needs an event publisher in our EDA system, it calls for one generically (“Hey, I need an object that has a way to let me publish an event into the message bus!”) and gets whatever is registered in the system (“Here’s a concrete object for you to do this. It has the method you need.”). Basically, you let a specific part of the system focus on managing dependencies, instead of the immediate code doing it for itself. That makes it easy to change parts. Is your custom-built publish-subscribe code not robust enough? Well, plug in NServiceBus. Built right, with the blocks of code offering up the same interface to achieve the same functions, you should be able to swap systems out. (A small code sketch after this list shows DI and EDA working together.)
  • Aspect-Oriented Programming (AOP): AOP is a programming paradigm, a style of coding. The keyword, aspect, describes a focused functional block of code that has a high amount of reuse throughout a system. Aspects are blocks that represent cross-cutting concerns because they “cut across” (a.k.a. “are used in”) many different blocks of unrelated code in the system. A classic example of an aspect is the need for a logging subsystem in an application to support debugging efforts. In a way, whereas DI is a passive way to allow one block of code to use another, AOP is much more active and/or broad-stroked. AOP prescribes a way to apply a block of code (a.k.a. advice, ex: “run this aspect before the block”) across some or all blocks of code (a.k.a. pointcuts, ex: “all methods in namespace X.Y.Z”). Aspects are a great way to keep such code in one place for maintainability, yet effectively apply it where necessary with low coupling, since the affected block is effectively oblivious to the aspect’s presence.
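
To make the DI and EDA ideas a little more concrete, here is a minimal C# sketch. The type names (IDomainEvent, IEventPublisher, OrderPlaced, PlaceOrderHandler) are hypothetical and not tied to any particular framework; the point is only the shape of the collaboration: the handler changes state, announces the change through an injected publisher, and never knows who is listening.

// Hypothetical types for illustration only; no specific framework is assumed.
public interface IDomainEvent { }

public class OrderPlaced : IDomainEvent
{
  public int OrderId { get; set; }
}

// The only thing most blocks of code need to know about the messaging system.
public interface IEventPublisher
{
  void Publish<TEvent>(TEvent domainEvent) where TEvent : IDomainEvent;
}

// A command handler (the "command" side of CQRS) that changes state and then
// raises an event, without any knowledge of its listeners.
public class PlaceOrderHandler
{
  private readonly IEventPublisher _publisher;

  // The publisher is injected by the container; this class never news one up.
  public PlaceOrderHandler(IEventPublisher publisher)
  {
    _publisher = publisher;
  }

  public void Handle(int orderId)
  {
    // ... persist the order here ...
    _publisher.Publish(new OrderPlaced { OrderId = orderId });
  }
}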

We’ll explore a lot of these concepts in detail in subsequent articles. And, I reserve the right to expand this overview list as I discover more topics that deserve an “executive summary” for those fresh to the series. For example, I have not covered testing, continuous integration, and other more technical items that add to our ability to deliver good software. If you think we should cover anything else, feel free to chime in.