Content management in a box 15 September 2009
Posted by lopataru in ECM, Research. Tags: content management, ECM, xDB
“Can I have two of those to go, please?”
A recent announcement from Oracle talks about an OLTP database machine. I’ll let you read the details and other comments in the official announcement and blogosphere.
When I received this pre-announcement over the weekend I appreciated the synergy between the two product lines: RDBMS and server. The RDBMS runs on a server... why not make a specially tuned RDBMS to run on specific hardware and also tune the hardware to generate whopping performance for that specific software? While I’m not sure the new Oracle product does all this, I can imagine it.
Now, back to our nice little ECM world. CM software is captive to the RDBMS. Its performance depends on it. The licensing goes hand in hand… You rarely (if ever) can use a major ECM suite without a properly set up RDBMS. Why is that? Well, I can think of several reasons like ease of deployment, portability, reasonable performance, time-to-market… but the question still remains: “Why not have a CM server?” One box to deliver it all. A CM “appliance”. An “Apple CM”… all in one box, no replaceable battery.
As I know EMC products quite well, it’s obvious this would be a very nice use case for xDB. Let’s see if the R&D can pull it off – I would do it by the end of 2010 if I were EMC and release it in 2011. I could really use a Documentum package which does not need a DB license/product and runs at least acceptably, if not better.
Back to the “box” idea (I really like the Apple analogy): I’m not necessarily talking here about the “no database CMs” (like the list here). I’m talking about a full-fledged, powerful, high-performance CM which is “in tune” with its metadata storage (based on an RDBMS or not…).
I’m pretty sure somebody already has this in their lab or even shop. I have a PhD thesis which is almost on this, and I’m probably not the most innovative guy in the world. I would love to learn about any such initiatives, but I’m too lazy to search for them today… that’s another to-do post-it.
It is said that times of crisis are the best drivers for innovation. Really?
My content management beginnings 9 September 2009
Posted by lopataru in ECM, Various. Tags: beginnings, ECM
1 comment so far
On my usual blog surf I came across a memory lane post from Pie talking about first CM apps.
I now realize I have been doing CM stuff since about ‘95. At that time I did not know what content management was (anyway, Wiki says “E”CM was coined in 2000). I was just building applications which managed semi-structured text documents, searched them by metadata and content, presented them to users on the intranet and on the web… etc.
My first one was a legal documentation system which managed all the laws and some jurisprudence in my country. That summed up to about 100,000 documents which needed to be fulltext indexed, formatted in hypertext, presented, linked, updated daily… the works. We even won some awards for that.
The first time I heard the term “Content Management” was when I worked for a European Union project to provide a distributed documentation system to a national network of citizen advice services. Then, a consultant from the UK told me: “hey, you are building a content management system”. I nodded my head and carried on… had no idea what he actually meant. It was about ‘99.. I think.
All went along until 2004 when I met Alchemy, Captiva, Legato and Documentum head-on (all pre-EMC). I still remember the feeling when I first opened a VM with Documentum on it and tried to find out what to click to get to the juice. And I was definitely hooked…
My first Documentum app was built with dmbasic and workflow. Pretty powerful solution, done without any training, and it ran daily for several years… oh… those were the days…
Future of Content Management – another blog post 15 August 2009
Posted by lopataru in ECM. Tags: ECM, vision
Why is everybody (eg: one, two, three, four (old), five (older)… me too) talking these days (and past) about the future of xCM?
Because we might feel disappointed with what we currently have and need to look to a positive future?
I felt the urge to reply to the recent posts within my reach but I thought I’d write here instead.
Everybody is right. (that was easy.. heh)
But… Lee touched on a very important phenomenon.
xCM needs to be simple and beautifully executed in order to succeed. We aim to make it omnipresent/universal/almighty.
This can be achieved only if it’s simple (to us). Like google search was. Like email. Like web. Like databases. And beautifully executed. Like google, like email, like web, like databases.
We need to work very hard to do this. And we need to have a lucky idea which will take little time to do and then get picked up by millions of followers (tweeted lately?). This is my dream.
/dream
Educate people. Research new ways. Invent. Enforce excellence. Leave the failures behind.
Get to work (this was for myself)
Summer ECM review 27 July 2009
Posted by lopataru in ECM. Tags: boring, ECM, vision
Here I am in the middle of summer, looking forward to a small holiday. And while I’m at this and waiting for an email to arrive from one of my colleagues… I’m thinking of doing a quick review of the ECM technologies I’ve used and seen lately.
The first thing which comes to my mind is that I have almost never seen an ECM project where a vendor product could be used out of the box with only configuration to be done.
Personally I use (and like) SharePoint for storing documents in some scenarios. It works great as a replacement for file shares – on electronic documents. But that’s it. For almost anything else in this area you need to call in a developer. Because this is what SharePoint 2007 is right now: a development platform.
Now, take EMC Documentum. I’ll just not mention the end user products. Nothing I’ve seen is suited for ootb use in ECM scenarios – you simply pay too much for a set of functionalities which actually stand in your way when you want to do something. In almost all (90%) of my implementations we needed to develop on the platform in order to meet the business needs.
Don’t get me wrong… I’m not saying this is not normal. It’s just that I strongly believe we need to change this.
On IBM FileNet… the same story. Different API, different limitations… same need for a developer. And IBM CM is definitely an “only developers” area.
And this is where my experience ends. Why? Because I didn’t find the time to play seriously with others (Alfresco, OpenText, Nuxeo… these are on my todo list)
So.. which is better? All. And none. It costs roughly the same to do the same ECM business requirements on all of these platforms. The difference comes from the other related services and activities (like installed base, integration with other products, skillset of existing IT at the vendor/partner/customer…).
So.. what’s next in these cloudy days? hehehe
No… not the cloud. That’s old news. And anyway, as fellow bloggers said… it’s the same Mary with a different hat. It’s a way for IT people to respond to business challenges while reusing the same technology. Not boring but dull.
I have the feeling that the big innovation must be already here. Buried somewhere in a garage/apartment company. In the brain of some enthusiasts who think outside the box. Where are you?
Content management is not easy. Especially when you need to take care of big organization inertia. And when you need to solve a problem “yesterday”, not “next year”. You can hardly keep innovating in these conditions. This is why big vendors probably can’t do it (reminds me of Virgin & “BA can’t get it up” stuff).
Anyway… I’ll just present my PhD thesis and go home. To my ECM projects on old and still rushed-to-market products. With vendor support which is unable to truly solve my issue. To escalation meetings where everybody tries to blame others…
Maybe after the holiday I’ll see the ECM vision. Somewhere between a jacuzzi and a glass of wine. Wish me luck!
cloud usage 16 June 2009
Posted by lopataru in ECM. Tags: cloud services, ECM
This is more like a microblog entry…
I’m thinking of a way to use Cloud services with traditional ECM.
Easy access to storage (large enterprises move slowly with acquisitions), disaster recovery scenarios and content accessibility for distributed users.
These come to me as nice benefits.
I’ll research how these fit with the ECM vendors’ strategies. Is this interesting during these times? Or is it just a nice gizmo?
Hmmmm…
Open Source and ECM 26 May 2009
Posted by lopataru in ECM. Tags: ECM, open source
3 comments
My team and I do ECM implementations using technology from most of the guys in Gartner’s leaders quadrant. EMC, IBM (both) and MS (cough!).
I can’t say I’m particularly happy with the products they have, but in general those products get us to the customer’s needs quicker. And those needs are not always purely centered on “content” or pure BPM. Content might be just passing through the picture so the solution gets to be tagged “ecm”. The “E” being included because the customer is a big organization, not necessarily because the solution addresses the whole company.
We always strive to get the customer to buy the whole ECM concept, since we would really like to do it in most cases. Sometimes we succeed (about 3-4 times a year)… sometimes we don’t… and we get to stick to the “departmental level app on top of a much more powerful platform”.
Every now and then we play with the idea “what if we would build our own ECM?”… Yeah… i know. That’s a subject for another entire blog and forum… hehehe
So we live a normal life inside this space.
We got used to taking SharePoint into almost any discussion with a new customer… Anyway, I have used SP in some very nice solutions so… it’s absolutely fine.
These days something strange happened. A private company customer (not a public organization) told us it’s thinking of building its solution on open source. This is the first time I have heard someone at CIO level in a large enterprise discussing the possibility of using Open Source for one of its platforms. I’ve heard this from public / state-owned organizations. But not from a Top100 private company.
I like the idea. The “techie” part in me is thrilled by it. But the “manager” & “real life solution provider” gnomes inside my head start to nod.
I know Alfresco and Nuxeo (just to name some) have nice products and probably a very successful installed base, but I have rarely seen them in this area. Maybe the market I’m in is not into it… yet. Who knows.
I’m not judging the technical capabilities. On a lot of items I know already these vendors outperform the “traditional” ones.
I’m worried about the ecosystem around them. About the technical skills to manage installations. About roadmap predictability. About the needed culture change in IT in order to work with it.
And I’m not counting on enthusiasts who can make anything work. I’m thinking of the average IT Joe who needs to be a service manager for such a system. I blogged about the difference between Windows and Unix administration tasks. I blogged about the decrease in IT quality. These things count at the “Enterprise” level. You can never have enough enthusiasts there (strangely enough, such enthusiasts exist more in the public sector).
At least this is my view, right now. After a beer and looking forward to some sleep.
And to end like in famous tv shows, with some questions:
How is Open Source used at the Enterprise level (from the beneficiary’s point of view)?
Is it different to sell and implement from the vendor / isv / var point of view?
Document management with SQL Server 20 March 2009
Posted by lopataru in ECM, Research. Tags: document management, ECM, sql server, Sql Server 2008
This is a placeholder post, I’ll update it as time goes by.
Currently I’m building a presentation to show to the IT community how SQL Server can be used to build Document Management systems.
My team and I have built many applications on SQL Server, several of them for DM. So I need to structure my experience a bit and give back to the community, while researching what similar things others have done and what the new SQL Server 2008 version brings to the table.
If you wish to share your thoughts, feel free..
later edit:
Of course I could not update the post as I researched…. but here are the outcomes:
Main topics of interest when trying to build a DMS solution on top of SQL 2008:
- Integrated Fulltext Search
- FILESTREAM data
- Remote Blob Store (RBS)
Other significant SQL Server 2008 functionalities:
- Backup compression
- Data compression
- Data encryption
- New DATE/TIME field (UTC)
- Improved XML processing (with Lax validation)
- Improved reporting services (who doesn’t need reports?)
- last, but not least: Sparse Columns
- more here
Full Text search
Now integrated into the database engine (and rewritten), the FTS engine provides more functions to the user and developer. Performance is roughly on par with 2005, but some areas show significant improvements.
For those brave enough to have used FTS in 2005 and previous versions, the migration options need to be considered (3 in total: rebuild, import, reset). Rebuild is needed especially if you want to take advantage of the new stemming and word-breaking rules and languages.
Nice things: stop words are now in the database. So they are accessible, programmable and transportable. They are also not only language dependent: you can also define other “set building” rules.
The thesaurus is still in XML but is now lazily cached and can be updated without restarting the server (yey!). Note that it behaves a little differently than in 2005. So you need to take care when migrating your XML files.
Cool stuff: troubleshooting functions! Something always needed to look into the FT “magic”. Being able to see what keywords were indexed for a particular document / collection is very nice. To see it from SQL is even nicer. To be able to see how a query is parsed and transformed is great. I’m also happy since I can see how the stemmer and thesaurus work for a particular case.
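To make that concrete, here is a minimal sketch of what those troubleshooting calls can look like, wrapped in small Python helpers that only build the T-SQL text (nothing here connects to a server). `sys.dm_fts_parser` and `sys.dm_fts_index_keywords` are the SQL Server 2008 functions; the helper names, the `DocStore` database and the `dbo.Documents` table are made up for illustration, and LCID 1033 (US English) is just an assumed default.

```python
def _sql_literal(value: str) -> str:
    """Escape a value for use inside a T-SQL N'...' string literal."""
    return value.replace("'", "''")


def fts_parser_query(search_condition: str, lcid: int = 1033,
                     stoplist_id: int = 0, accent_sensitive: bool = False) -> str:
    """Build T-SQL showing how the full-text engine breaks up a search condition.

    Passing something like FORMSOF(INFLECTIONAL, "run") shows the stemmer's
    output; FORMSOF(THESAURUS, "run") shows the thesaurus expansions.
    stoplist_id 0 means the system stoplist.
    """
    return (
        "SELECT keyword, display_term, special_term, expansion_type, source_term "
        f"FROM sys.dm_fts_parser(N'{_sql_literal(search_condition)}', "
        f"{lcid}, {stoplist_id}, {int(accent_sensitive)});"
    )


def indexed_keywords_query(database: str, table: str) -> str:
    """Build T-SQL listing the keywords stored in a table's full-text index."""
    return (
        "SELECT display_term, column_id, document_count "
        f"FROM sys.dm_fts_index_keywords(DB_ID(N'{_sql_literal(database)}'), "
        f"OBJECT_ID(N'{_sql_literal(table)}')) "
        "ORDER BY document_count DESC;"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(fts_parser_query('FORMSOF(INFLECTIONAL, "run")'))
    print(indexed_keywords_query("DocStore", "dbo.Documents"))
```

Paste the printed statements into a query window against a database that has a full-text index and you get the parsed terms and the indexed keywords back as plain result sets.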
Some advice: take care if you have many keywords (tens of millions). Use fast disks, IO is very important. Use 64-bit: 3 GB of RAM is usually not enough. Don’t confuse FREETEXT and CONTAINS, use them wisely.
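On the FREETEXT vs. CONTAINS point, a rough sketch of the difference: CONTAINS takes a structured search condition (quoted terms, AND/OR, FORMSOF and so on) and matches exactly what you ask for, while FREETEXT takes plain text and lets the engine apply stemming and the thesaurus on its own. The helpers below only build strings; the `dbo.Documents` table, its `doc_id`, `title` and `body` columns and the function names are assumptions for illustration, and `?` is the ODBC-style parameter placeholder.

```python
def contains_condition(terms, inflectional=False):
    """Build a CONTAINS search condition that requires every term.

    With inflectional=True each term is wrapped in FORMSOF(INFLECTIONAL, ...)
    so that other word forms (run, running, ran) match as well.
    """
    parts = []
    for term in terms:
        quoted = '"' + term.replace('"', '') + '"'  # drop embedded double quotes
        parts.append(f"FORMSOF(INFLECTIONAL, {quoted})" if inflectional else quoted)
    return " AND ".join(parts)


def search_sql(predicate):
    """Return a parameterized query using CONTAINS or FREETEXT.

    The search text is passed separately as a query parameter.
    """
    if predicate not in ("CONTAINS", "FREETEXT"):
        raise ValueError("predicate must be CONTAINS or FREETEXT")
    return f"SELECT doc_id, title FROM dbo.Documents WHERE {predicate}(body, ?);"


if __name__ == "__main__":
    # Precise: both words must be present, in any inflected form.
    print(search_sql("CONTAINS"), contains_condition(["contract", "renew"], inflectional=True))
    # Loose: hand the user's sentence to the engine as-is.
    print(search_sql("FREETEXT"), "renewing a contract")
```

The design choice is simply to keep the two paths separate: structured conditions for search forms and filters, FREETEXT for a single "type anything" box.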
BLOB related news
First of all, please don’t use IMAGE and TEXT/NTEXT fields anymore. Microsoft has deprecated them and no longer encourages their use.
You can use VARBINARY(MAX), but you hit the 2 GB limit with it. Use the FILESTREAM modifier (new in 2008) to kill that limit.
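The 2 GB limit is just arithmetic: a plain VARBINARY(MAX) value tops out at 2^31 - 1 bytes. A tiny sketch of the decision, where the function name and the returned labels are purely illustrative:

```python
# Maximum size of a plain VARBINARY(MAX) value: 2^31 - 1 bytes (just under 2 GB).
VARBINARY_MAX_LIMIT = 2**31 - 1


def storage_for(size_bytes: int) -> str:
    """Say which column definition can hold a document of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes <= VARBINARY_MAX_LIMIT:
        return "VARBINARY(MAX)"
    # Beyond the limit, only the FILESTREAM variant can hold the value.
    return "VARBINARY(MAX) FILESTREAM"


if __name__ == "__main__":
    print(storage_for(50 * 1024 * 1024))        # a 50 MB scan
    print(storage_for(3 * 1024 * 1024 * 1024))  # a 3 GB video
```

This only covers the hard limit; whether FILESTREAM is also the better choice for smaller files depends on the trade-offs discussed next.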
FILESTREAM stores the content on an NTFS volume. Nice. And tricky at the same time. Good for streaming, not so good for frequent updates. Good for big files, not so good for many files (especially when having short backup windows).
Nice: works from TSQL as well as Win32. Not so nice: behaves a little differently in TSQL vs. Win32 (transaction isolation level, performance – not necessarily better in Win32).
So, you really have to understand it before using it. You can fall into some not so obvious pitfalls. But it’s a good thing.
Remote Blob Store – RBS
If you don’t know what CAS (Content Addressable Storage) is, you probably don’t need it.
It’s not another column type, it’s an API to be implemented mainly by CAS vendors and used by applications.
Somehow, it’s similar to EBS on SharePoint. In fact, there is a competition between the two (some nice coverage is here), and I also feel that RBS is the way to go (regardless of the current limitation on accessing the context of the Blob).
EMC already has an RBS connector for Centera. Nice.
So, 2008 brings a lot of nice things to the table. Let me know when you use them.
Is traditional ECM up for high performance 9 March 2009
Posted by lopataru in ECM. Tags: ECM, performance
2 comments
Last week I stumbled over the WordPress statistics for February.
Impressive. That got me thinking about how I would implement this kind of backend system with a traditional ECM (read “top Gartner stuff”). And I shivered inside while thinking of IBM CM, Documentum, FileNet… SharePoint (lol!).
I remember once seeing a support issue with one of the above vendors in which the administration tool could not display the size of filestores bigger than 2 GB. And I believe the issue is still there, after 2-3 years. Tragic.
Imagine a customer seeing this (and they do) and saying… “Well.. what kind of Enterprise system is this? If it cannot show correctly storage spaces over 2 GB? How can I trust it with my Terabytes?”
My practice-based experience tells me that large-scale performance cannot normally be achieved with Enterprise grade software. You have a better chance with some highly skilled professionals (not many, 4-7 should be enough) who can put together some “indie” software.
In my work, I have rarely (read “never”) seen an ECM system handle a load similar to the WordPress one. ECM in my area normally tends to cap at several million items and a few TB of space. While requiring a deluge of hardware (I still can’t understand why on earth I would need a minimum of 4 GB of RAM for an Index Server, even in the smallest install).
You out there…. Working on ECM… I would like to know your statistics.
Documentum 6.5 ramblings, or something 2 March 2009
Posted by lopataru in ECM. Tags: documentum
I just noticed that another month passed by without me writing anything here.
Obviously because I’m busy… same (lame) excuse. Not really. Of course the job is always demanding, but getting 30 minutes a day to ramble about something can’t be that hard.
So I pulled myself together and logged back in here. Let’s talk about my recent Documentum experiences.
At this stage we are undergoing some D 6.5 implementations. SP1, just to be sure.
I’m not involved technically first hand, and the last time I got my hands on it was about 4-5 months ago. This is to set expectations straight… you’re not about to see any technical revelations here.
First thing: “shit! is different!” the whole install process I mean. Where’s my DocApp? No more docapps, use the “.dar”. Learn Eclipse… headless.. that is. The Eclipse, not me. Not yet.
Ok, basic stuff works. TaskSpace… almost ok. Let’s move to the Imaging Services… Brr… Strike 1… Strike 2… Use Webtop.
I promise myself I’ll get back on this. Never did. And now my colleagues suffer the same things. Poor them.
But it moves faster. A lot faster. Then I increased the RAM available to the virtual machine. Shouldn’t have done that.
Note to self: don’t give virtual machines more memory than the host can spare. Swapping is bad..
Business Process Manager now. Done. All ok. BP Services? Not yet. Later.
Oooo… BAM. I mean Business Activity Monitor. I feel like Dee-Dee in the laboratory. Same effect.
Rollback to VM snapshot.
At this point I think: there must be an easy way.
Then I read in the installation manual: “rename the XXX file to YYY file and then run setup.exe” (names were changed to protect the innocent). As a last line on the page. Why the %^&*() do I need to rename a file which is provided in the installation kit? Oh well.. this is why we are paid the good bucks.. can only imagine what SAP looks like. Oh, stop, I know that also… tough job.
Wanna try Annotation Services? With 6.5 ? Sure…. bring it on. It worked! That brought the spirit up. For a moment.
Today it does not work anymore. Looking for some LiveCycle piece? Good luck!
Great. Now what? Laugh hysterically and get back on it.
You know, Oracle installation kits are free on the Internet. You know why? Because it takes a skilled person to install it properly. Go figure.
Why am I telling you this? Because next week, when all the puzzle pieces are nicely put together, I can look here with my colleagues and laugh.
And since you read it all the way up to here: let’s build an HA environment together. And put some Document Sciences on it to spice it up.
PhD paper done. Phew! 19 January 2009
Posted by lopataru in ECM, Research.
3 comments
Finally, after a long time I now have a complete version of my PhD thesis.
I would like to elaborate more on this, but after spending 4 days secluded in a mountain cottage in front of my laptop… I simply can’t.
Now I just need to publish some articles and present my creation to the public. Behold!