SWAT Blog

Archive for the ‘Technology’ Category

Generate Date Ranges in Oracle PL/SQL

Tuesday, January 28th, 2014

Here is an easy way to generate monthly date ranges in Oracle PL/SQL:

WITH t AS
  (SELECT to_date('01-OCT-2012') start_date,
    to_date('30-SEP-2013') end_date
  FROM dual
  )
  SELECT add_months(TRUNC(start_date,'mm'),LEVEL      - 1) START_DATE,
    add_months(add_months(TRUNC(start_date,'mm'),LEVEL - 1),1)-1 END_DATE
  FROM t
    CONNECT BY TRUNC(end_date,'mm') >= add_months(TRUNC(start_date,'mm'),LEVEL - 1)

OUTPUT:

START_DATE  END_DATE  
----------- -----------
01-OCT-2012 31-OCT-2012 
01-NOV-2012 30-NOV-2012 
01-DEC-2012 31-DEC-2012 
01-JAN-2013 31-JAN-2013 
01-FEB-2013 28-FEB-2013 
01-MAR-2013 31-MAR-2013 
01-APR-2013 30-APR-2013 
01-MAY-2013 31-MAY-2013 
01-JUN-2013 30-JUN-2013 
01-JUL-2013 31-JUL-2013 
01-AUG-2013 31-AUG-2013 
01-SEP-2013 30-SEP-2013 

 12 rows selected

NFS Server Setup on Ubuntu server and Desktop

Friday, December 28th, 2012

Server side Configuration–

root@server-desktop:~#apt-get install nfs-kernel-server portmap nfs-common nfswatch

 

root@server-desktop:~#mkdir /srv/nfs

 

root@server-desktop:~#chown nobody:nogroup /srv/nfs/

 

root@server-desktop:~#vi /etc/exports

#Add the line below to the exports file, then save and exit. (The address is the client IP or network that is allowed to mount the share.)

/srv/nfs 172.17.0.20/255.255.0.0(rw,sync,no_subtree_check)

#Then run this command in the terminal:


root@server-desktop:~#exportfs -a

#Then restart nfs and portmap services


root@server-desktop:~#/etc/init.d/nfs-kernel-server restart

* Stopping NFS kernel daemon [ OK ]

* Unexporting directories for NFS kernel daemon… [ OK ]

* Exporting directories for NFS kernel daemon… [ OK ]

* Starting NFS kernel daemon [ OK ]


root@server-desktop:~#/etc/init.d/portmap restart

Rather than invoking init scripts through /etc/init.d, use the service(8)

utility, e.g. service portmap restart

Since the script you are attempting to invoke has been converted to an

Upstart job, you may also use the stop(8) and then start(8) utilities,

e.g. stop portmap ; start portmap. The restart(8) utility is also available.

portmap stop/waiting

portmap start/running, process 3784

#To check the exported folder, run the showmount command:


root@server-desktop:~#showmount -e

#The above command shows the export list:

Export list for server-desktop:

/srv/nfs 172.17.0.20/255.255.0.0

Ubuntu Client side Configuration–


root@desktop:~#apt-get install nfs-common portmap
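
#Optionally, before creating the mount point you can verify that the export is visible from the client (showmount is provided by nfs-common; server IP as configured above):

root@desktop:~#showmount -e 172.17.0.20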

#Now make a folder in /mnt.


root@desktop:~#mkdir /mnt/nfs

 


root@desktop:~#mount 172.17.0.20:/srv/nfs /mnt/nfs

#The command below shows all mounted filesystems, including the new NFS mount.


root@desktop:~#df -h
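
#Optionally, to have the share mounted automatically at boot, you can add a line like the one below to /etc/fstab on the client (same server IP and mount point as above; the options shown are just a common default) and then test it with mount -a:

172.17.0.20:/srv/nfs /mnt/nfs nfs defaults 0 0

root@desktop:~#mount -a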

Fedora 17 Client side Configuration–


[root@administrator ~]#yum -y install nfs-utils

Loaded plugins: fastestmirror, langpacks, presto, refresh-packagekit

Loading mirror speeds from cached hostfile

* fedora: mirror.cse.iitk.ac.in

* rpmfusion-free: mirror.cse.iitk.ac.in

* rpmfusion-free-updates: mirror.cse.iitk.ac.in

* rpmfusion-nonfree: mirror.cse.iitk.ac.in

* rpmfusion-nonfree-updates: mirror.cse.iitk.ac.in

* updates: mirrors.ustc.edu.cn

Resolving Dependencies

--> Running transaction check

---> Package nfs-utils.i686 1:1.2.6-6.fc17 will be installed

--> Processing Dependency: quota for package: 1:nfs-utils-1.2.6-6.fc17.i686

--> Processing Dependency: libnfsidmap.so.0 for package: 1:nfs-utils-1.2.6-6.fc17.i686

--> Processing Dependency: libnfsidmap for package: 1:nfs-utils-1.2.6-6.fc17.i686

--> Running transaction check

---> Package libnfsidmap.i686 0:0.25-3.fc17 will be installed

---> Package quota.i686 1:4.00-5.fc17 will be installed

--> Processing Dependency: quota-nls = 1:4.00-5.fc17 for package: 1:quota-4.00-5.fc17.i686

--> Running transaction check

---> Package quota-nls.noarch 1:4.00-5.fc17 will be installed

--> Finished Dependency Resolution

Dependencies Resolved

=========================================================================

Package Arch Version Repository Size

=========================================================================

Installing:

nfs-utils i686 1:1.2.6-6.fc17 updates 319 k

Installing for dependencies:

libnfsidmap i686 0.25-3.fc17 updates 33 k

quota i686 1:4.00-5.fc17 updates 162 k

quota-nls noarch 1:4.00-5.fc17 updates 74 k

Transaction Summary

=========================================================================

Install 1 Package (+3 Dependent packages)

Total download size: 588 k

Installed size: 2.0 M

Downloading Packages:

(1/4): libnfsidmap-0.25-3.fc17.i686.rpm | 33 kB 00:01

(2/4): nfs-utils-1.2.6-6.fc17.i686.rpm | 319 kB 00:01

(3/4): quota-4.00-5.fc17.i686.rpm | 162 kB 00:02

(4/4): quota-nls-4.00-5.fc17.noarch.rpm | 74 kB 00:01

--------------------------------------------------------------------------------

Total 83 kB/s | 588 kB 00:07

Running Transaction Check

Running Transaction Test

Transaction Test Succeeded

Running Transaction

Installing : 1:quota-nls-4.00-5.fc17.noarch 1/4

Installing : 1:quota-4.00-5.fc17.i686 2/4

Installing : libnfsidmap-0.25-3.fc17.i686 3/4

Installing : 1:nfs-utils-1.2.6-6.fc17.i686 4/4

Verifying : libnfsidmap-0.25-3.fc17.i686 1/4

Verifying : 1:nfs-utils-1.2.6-6.fc17.i686 2/4

Verifying : 1:quota-4.00-5.fc17.i686 3/4

Verifying : 1:quota-nls-4.00-5.fc17.noarch 4/4

Installed:

nfs-utils.i686 1:1.2.6-6.fc17

Dependency Installed:

libnfsidmap.i686 0:0.25-3.fc17

quota.i686 1:4.00-5.fc17

quota-nls.noarch 1:4.00-5.fc17

Complete!


[root@administrator ~]#mount 172.17.0.20:/srv/nfs /mnt/nfs

Enabling the NFS client on a Windows 7 Ultimate/Enterprise system:

  1. Select Control Panel.
  2. Select Programs.
  3. Select Programs and Features.
  4. Select Turn Windows Features on or off.
  5. Select Services for NFS.
  6. Select the check box Client for NFS and click OK.

Run the command below at the command prompt:

mount \\172.17.0.20\srv\nfs z:

Virtual Scrolling with dGrid – dojo, dGrid, node.js and mongodb

Wednesday, May 9th, 2012

Recently SitePen released a beta of a light-weight, feature-rich data grid component (compatible with Dojo 1.7) called dgrid. I am working on upgrading an application that uses Dojo 1.4 to Dojo 1.7 and was looking to replace the custom grid solution that I wrote with a better grid system. My requirements were to replace pagination with virtual scrolling and server-side sorting – dgrid fits the bill. I couldn’t find any example of such an implementation, so I thought I would write one myself and share it with others in this blog post.

The example shows the S&P 500 company list (data taken from Wikipedia) in a dgrid set up for virtual scrolling and server-side sorting.

Technologies I have used are as follows:

  1. Frontend – Dojo 1.7/HTML/JavaScript (dgrid, put-selector and xstyle AMD-based modular components)
  2. Services – node.js (Express – for exposing/routing REST-based services, and Mongoskin – as the mongodb driver)
  3. Database – mongodb

The main objective of this post is to highlight how dgrid can be used for virtual scrolling. I have used node.js and mongodb as I find these appropriate for quick prototyping and as such I will not cover details of these technologies in this post. Once the mechanism of REST calls is clear any technology stack can be used to provide REST services.

Project Setup

  1. Create project directory (PROJECT_ROOT)- In this example project directory is ‘dgrid_virtual_scroll’
  2. Create subdirectory ‘PROJECT_ROOT/public/resources/js/lib’ under project directory
  3. Download and extract following under ‘PROJECT_ROOT/public/resources/js/lib':
    1. dojo toolkit 1.7 – http://dojotoolkit.org/download/
    2. dgrid – http://dojofoundation.org/packages/dgrid/#demos
    3. xstyle – https://github.com/kriszyp/xstyle
    4. put-selector – https://github.com/kriszyp/put-selector
  4. Make sure node.js and npm are installed – Installation can be found at http://nodejs.org/#download
  5. Install express and mongoskin packages using npm
    1. navigate to project directory (PROJECT_ROOT)
    2. Run following commands:
      npm install express
      npm install mongoskin
      
  6. Create blank index.html file under PROJECT_ROOT/public directory
  7. Create blank app.js file under PROJECT_ROOT directory

The directory structure should look like the figure below:

Database setup

  1. Install mongodb – http://www.mongodb.org/downloads
  2. Insert data:
    1. download test data file – companies.txt
    2. Start mongodb server by running command:
      mongod
      
    3. import test data file by running following command:
      mongoimport -d snp -c companies companies.txt
      

      The command will import the data in a collection called companies in a database called snp in mongodb.
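
To confirm that the import worked, you can count the documents from the mongo shell (database and collection names as above; the mongo shell is assumed to be on your PATH):

      mongo snp --eval "db.companies.count()"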

Create Node.js service application
Open the file PROJECT_ROOT/app.js and copy the following code:

var http = require('http'),
	express = require('express'),
    mongo = require('mongoskin'),
    server = new mongo.Server('localhost', 27017, {auto_reconnect: true}),
    db = new mongo.Db('snp', server);


 var app = module.exports = express.createServer();
app.configure(function() {
  app.set('views', __dirname + '/views');
  app.use(express.favicon());
  app.use(express.bodyParser());
  app.use(express.cookieParser());
  app.use(express.logger({ format: '\x1b[1m:method\x1b[0m \x1b[33m:url\x1b[0m :response-time ms' }))
  app.use(express.methodOverride());
  app.use(express.static(__dirname + '/public'));
});
 

app.get('/rest',  function(req, res) {
	var urlpart = req.url;
	var sortObj ={};
	if(urlpart){
		if(urlpart.indexOf('(')>-1){
			var sortpart=urlpart.split('(')[1].slice(0, -1);
			var sortfield=sortpart.substring(1);
			var sortsign=sortpart.substring(0,1);
			var sortdirection=1;
			if (sortsign==='-') sortdirection=-1;
			sortObj = JSON.parse('{"'+sortfield+'":'+sortdirection+'}');
		}
	}
	console.log(sortObj)
	var start=0;
	var end=10000;	
	var range=req.header('Range');
	if(range!==null && range !==undefined){
		var m=range.split('=')[1].split('-');
	 	start=parseInt(m[0]);
		end=parseInt(m[1]);
	}
	var pagesize=end-start+1;
	db.open(function(dberr, db) {
		if(!dberr){
			db.collection('companies').count(function(err, data){
				res.header('Content-Range',start+'-'+end+'/'+data);
				db.collection('companies').find().sort(sortObj).skip(start).limit(pagesize).toArray(function(err, items){
					res.send(JSON.stringify(items));
					db.close();
				}) ;

			}) ;
		}	
	});
	
	
});
app.get('/',  function(req, res) {
		res.redirect('/index.html');
});
  
if (!module.parent) {
  app.listen(3000);
  console.log('Express server listening on port %d, environment: %s', app.address().port, app.settings.env)
}

In the code above, the app.get('/') handler near the bottom is the default routing to the index.html page (we’ll come to it shortly), and the app.get('/rest') handler routes the REST service call from index.html and sends the JSON back to it. The documentation for the REST API used by JsonRest can be found at http://dojotoolkit.org/reference-guide/1.7/quickstart/rest.html. Thanks are due to Ken Franqueiro for pointing me to the API doc – which I found very useful.

Create index.html Application frontend
Open file PROJECT_ROOT/public/index.html and copy the following code:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
	<head>
		<title>Test JsonRest store</title>
		<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
		<style type="text/css">
			@import "/resources/js/lib/dojo/resources/dojo.css";
			@import "/resources/js/lib/dgrid/css/skins/claro.css";
			@import "/resources/js/lib/dijit/themes/claro/claro.css";		
			#container { width: 960px; margin: auto; }
			#grid{ height:500px; }
		</style>
		<script type="text/javascript" src="/resources/js/lib/dojo/dojo.js" data-dojo-config="async: true"></script>
	</head>
	<body class="claro">
		<div id="container"></div>
	</body>
	<script type="text/javascript">
		require(["dojo/dom", "dgrid/OnDemandGrid", "dgrid/Selection","put-selector/put", 
			"dgrid/Keyboard", "dojo/_base/declare", "dojo/store/JsonRest", 
			"dojo/store/Observable", "dojo/domReady!" ], 
			function(dom, Grid, Selection, put, Keyboard, declare, JsonRest, Observable) {
				function initView(){
					put(dom.byId("container"),"div#grid");
					
					var companyStore = Observable(JsonRest({
						target : "/rest",
						idProperty : "_id"
					}));
					
					var columns = [ {
						label : 'S&P Ticker',
						field : 'ticker',
						sortable : true
					}, {
						label : 'Company Name',
						field : 'company',
						sortable : true
					}, {
						label : 'Industry',
						field : 'industry',
						sortable : true
					}, {
						label : 'Head Office',
						field : 'headoffice',
						sortable : true
					}];
			
					var grid = new (declare([ Grid, Selection, Keyboard ]))({
						store : companyStore,
						getBeforePut : false,
						columns : columns,
						minRowsPerPage : 25,
						loadingMessage: 'Loading data...',
						noDataMessage: 'No data found'
					}, "grid");
			}   		
			initView();
		});
	</script>
</html>

Run application

  1. Make sure mongodb is up and running
  2. Start the node application by running the following commands:
    cd PROJECT_ROOT
    node app.js
    
  3. Open http://localhost:3000/ and you should see a view similar to the following:
  4. Try scrolling down and sorting by clicking on the column headers, and notice in the MongoDB and Node.js console windows how the calls are being made and how the results appear in the browser window

If you have Firebug installed you can open it and see the various header variables and request parameters that are used during the REST calls:

In a nutshell, the virtual scrolling and pagination request and response parameters look like the following (a sample request using curl is sketched after this list):

  1. REST request – The range of data is sent in the request header under the Range variable with a value like 'items=24-68', and the sort field name and direction are sent as part of the URL, like http://localhost:3000/rest?sort(+ticker); inside the parentheses the +/- sign signifies the sort direction and the text is the name of the column to be sorted.
  2. REST response – The data for the requested range is sent back with the header variable 'Content-Range', whose value follows the pattern 'fromIndex-toIndex/totalNumberOfrecords'; in the screenshot you can see 24-68/500, where 24 is the start index, 68 is the end index and 500 is the total number of records.
  3. I have tried this with over a million records and it works like a charm.
    DONE!!!
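
For example, assuming the node application from this post is running locally and curl is installed, you can reproduce by hand the kind of request the grid makes; -v prints both the request headers and the Content-Range response header:

  curl -v -H "Range: items=0-24" "http://localhost:3000/rest?sort(+ticker)"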

Convert CRLF to LF line endings recursively

Friday, January 20th, 2012

The dos2unix utility converts a file from DOS to Unix format, i.e. it replaces CRLF line endings with LF line endings.

This utility can be installed on any *nix environment using apt, yum, ports, etc.

To install on Mac:

sudo port -d selfupdate
sudo port install dos2unix

To install on Debian/Ubuntu/Mint:

apt-get install dos2unix

To run this utility on all files in a directory (excluding subdirectories), cd to the directory and then run the following command:

find . -maxdepth 1 -type f -exec dos2unix {} \;

To run it recursively (including subdirectories), cd to the directory and then run the following command:

find . -type f -exec dos2unix {} \;

Process Address Space – Code, gvar, BSS, Heap & Stack

Wednesday, March 23rd, 2011

Almost all modern operating systems implement memory protection to prevent other processes from accessing the private address range allocated to a process, thereby avoiding storage violations. This private space is called the process address space.

The process address space is partitioned into various memory areas, or segments, based on their functional differences.

Text/Code Segment

This segment, also known as the code segment, holds the executable instructions of a program.

  • execute-only
  • fixed size

Data Segment

The data segment contains the global and static variables used by the process that are explicitly initialized. This segment can be further classified into an initialized read-only area and an initialized read-write area.

  • Gvar Section
    Holds global variables that are initialized with explicit values and stored in the data segment. This section has read/write attributes but cannot be shared among processes running the same program.
  • BSS Section
    This section holds uninitialized data. This data consists of global variables that the system initializes with 0s upon program execution. Another name for this section is the zero-initialized data section.
  • Heap Section
    This is used to grow the linear address space of a process. When a program uses malloc() to obtain dynamic memory, this memory is placed in the heap. The heap area begins at the end of the BSS segment and “grows up” to larger addresses from there. The heap area is shared by all shared libraries and dynamically loaded modules in a process.

    • read/write
    • variable size
    • dynamic allocation by request

Stack Segment

This contains all the local variables that get allocated. When a function is called, the local variables for that function are pushed onto the stack. As soon as a function ends, the variables associated with the function are popped from the stack. Other information, including return addresses and parameters, is also stored in the stack. The stack is a LIFO structure, typically located in the higher parts of memory. It usually “grows down” with every register, immediate value or stack frame being added to it.

  • read/write, variable size
  • automatic growth/shrinkage
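
On Linux you can get a feel for these areas without writing any code: the size utility prints the text, initialized data and BSS sizes built into a binary, while /proc/<pid>/maps lists the heap and stack mappings of a running process (the binary path below is just an example):

  size /bin/ls
  grep -e heap -e stack /proc/self/maps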

SQL Server Comma Delimited to table

Tuesday, March 1st, 2011
One of the frequently asked questions in a lot of SQL Server forums is how to handle a comma-delimited value passed as a parameter in a stored procedure.
To better illustrate the question, let’s say you have a website with a page that lets users select data items, which are then sent back to the database for further processing in a variety of formats; of all the formats, comma-separated is probably the one developers struggle with most if they do not know how to consume this structure.

Databases do not understand data arranged horizontally, so how do we convert the data from CSV (horizontal) to a table (vertical)?

Fortunately, there are a couple of tricks up the database’s sleeve.

1. Dynamic SQL Approach
This method involves creating a SQL statement at execution time to evaluate the WHERE condition. But this approach is as ugly as it sounds nasty. Creating SQL statements on the fly results in a lot of activity in the plan cache, i.e. statement compilation, execution plan generation and final execution of the query. All of these consume precious server resources and hence should be avoided if at all possible.
Due to the obvious pitfalls in this approach I am not even going to attempt to show the how-to.
2. Common Table Expression (CTE) based Approach
Common table expressions (CTEs) are probably one of the most powerful programming constructs introduced in SQL Server 2005. They can be used for hierarchical queries, special aggregations and various other kinds of problem solving. Various uses of CTEs are discussed elsewhere in my blog and can be found using the search button on the blog homepage.
Below is a CTE-based implementation. The good thing about this approach is that it is completely object-independent, so the function can be used with any table in any schema of the database, as long as you have sufficient permissions.

USE [AdventureWorks]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[ufnCSVToTable]')
AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))
DROP FUNCTION [dbo].[ufnCSVToTable]
GO
CREATE FUNCTION [dbo].[ufnCSVToTable](@str VARCHAR(4000), @sep CHAR(1))
RETURNS @table TABLE
( [colx] VARCHAR(4000) NULL )
AS
BEGIN
WITH r0 AS
(
SELECT 1 n,1 m
UNION ALL
SELECT n+1, m+ CASE WHEN SUBSTRING(@str,n,1)=@sep THEN 1 ELSE 0 END
FROM r0 WHERE n<=LEN(@str)
)
INSERT INTO @table
SELECT SUBSTRING(@str,MIN(n), MAX(n)-MIN(n)) colx
FROM r0 GROUP BY m;
RETURN;
END;
GO
--Independent usage
SELECT * FROM [dbo].[ufnCSVToTable]('this,is,a,comma,separated,sentence', ',')
Output:
colx
this
is
a
comma
separated
sentence


--Usage with table
DECLARE @str VARCHAR(1000); SET @str = '1,2';
SELECT *
FROM Person.Address Ad
INNER JOIN [dbo].[ufnCSVToTable](@str,',') CSV ON Ad.AddressID = CSV.colx

Voila!

I hope you find this function useful.

PostgreSQL Databases Backup & Restore with pg_dump command

Friday, February 11th, 2011

Make sure you are connected to your server via an SSH command line.
Log in as the superuser and type the following command.

[root@linux10 ~]#su - postgres

Get the list of database(s) to back up.

[postgres@linux10 ~$]#psql -l

                                List of databases
   Name    |  Owner   | Encoding |  Collation  |    Ctype    | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 mine      | mine     | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 |
 template0 | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | =c/postgres
                                                             : postgres=CTc/postgres
 template1 | postgres | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | =c/postgres
                                                             : postgres=CTc/postgres
(4 rows)


Make a backup using pg_dump. pg_dump is a utility for backing up a PostgreSQL database; it dumps only one database at a time.

Dump the mine database:

[postgres@linux10 ~$]#pg_dump mine > /opt/mine.out

To restore the mine database:

[postgres@linux10 ~$]#psql -d mine -f /opt/mine.out

Or, if you have a new server, first create the database and then run the commands:

[postgres@linux10 ~$]#createdb mine
[postgres@linux10 ~$]#psql mine
[postgres@linux10 ~$]#psql -d mine -f /opt/mine.out

The second option is to use the pg_dumpall command. It dumps (backs up) every database and preserves cluster-wide data such as users and groups. You can use it as follows:

[postgres@linux10 ~$]#pg_dumpall > /opt/all.db.out

To restore the backup, use the following command:

[postgres@linux10 ~$]#psql -f /opt/all.db.out postgres
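
If you prefer compressed, date-stamped backups (for example from a cron job), the same commands can be combined with gzip; the file names below are only an example:

[postgres@linux10 ~$]#pg_dump mine | gzip > /opt/mine_$(date +%Y%m%d).sql.gz
[postgres@linux10 ~$]#gunzip -c /opt/mine_20110211.sql.gz | psql -d mine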

How To query contiguous data ranges – Oracle

Saturday, January 22nd, 2011

I recently presented a way of querying contiguous data ranges using SQL Server. In this post I am going to tackle the same problem but solve it in Oracle instead.
Oftentimes we are confronted with problems that seem very easy to start with, but once you get your teeth into them you realise they weren’t as easy as initially thought. In this blog, I am attempting to work with data ranges that are not easily groupable.
Problem Statement: Using the data structure listed below, group the data ranges by dataKey where the dataKey is contiguous; when breaks in the dataKey are identified, create a new reporting output line. Query input-

grp dataRange dataKey Comments
A 1000 1
A 1001 1
A 1002 1
A 1003 1
A 1004 1
A 1005 2 — Notice dataKey is not contiguous
A 1006 2
A 1007 1
A 1008 1
A 1009 1
A 1010 1
A 1011 1

Desired Query output-

dataKey DataRangeBreakDown
1 A [1000 .. 1004]
2 A [1005 .. 1006]
1 A [1007 .. 1011]

Approach: This is an interesting puzzle, because we cannot simply group the data by the “dataKey” column; doing so would ignore the breaks in the data. The key to solving this refreshing problem is to “somehow” introduce a pseudo-column which does the heavy lifting for us. The table below illustrates the new pseudo-column; if we introduce this column using SQL, the rest is just a simple GROUP BY. Before peeking at the final solution, I would encourage you to have a go, but if you can’t for any reason, the final solution is presented later in the blog.

grp dataRange dataKey New Column
A 1000 1 0
A 1001 1 0
A 1002 1 0
A 1003 1 0
A 1004 1 0
A 1005 2 1
A 1006 2 1
A 1007 1 2
A 1008 1 2
A 1009 1 2
A 1010 1 2
A 1011 1 2

Setup Data:

 DROP TABLE tmpDataRange; CREATE TABLE
  tmpDataRange AS SELECT 'A' grp, 1000 AS dataRange, 1 AS dataKey FROM Dual UNION ALL
  SELECT 'A', 1001, 1 FROM Dual UNION ALL SELECT 'A', 1002, 1 FROM Dual UNION ALL SELECT
  'A', 1003, 1 FROM Dual UNION ALL SELECT 'A', 1004, 1 FROM Dual UNION ALL SELECT 'A',
  1005, 2 FROM Dual UNION ALL SELECT 'A', 1006, 2 FROM Dual UNION ALL SELECT 'A', 1007, 1
  FROM Dual UNION ALL SELECT 'A', 1008, 1 FROM Dual UNION ALL SELECT 'A', 1009, 1 FROM
  Dual UNION ALL SELECT 'A', 1010, 1 FROM Dual UNION ALL SELECT 'A', 1011, 1 FROM Dual ;
  

Solution:
Welcome to the solution. I have tackled the problem with the approach above, using the Common Table Expression (CTE) feature. I think CTEs are fantastic even for writing daily garden-variety queries, as they make the query so much more understandable by being written in the way we digest other information, i.e. top to bottom. Overview of the solution query:
Data0 – Instantiate the problem statement input data
Data1 – Use LAG to bring the previous row’s dataKey alongside each row
Data2 – Flag the rows where the dataKey breaks from the previous row
Data3 – This is where the “magic” happens – Build a cumulative window column which increments on every dataKey break
Data4 – Is concerned with grouping and presenting the data in the format requested. Presented below is the final solution:

WITH data0 AS (
  SELECT grp, dataRange, dataKey FROM tmpDataRange
),
data1 AS (
  SELECT grp, dataRange, dataKey,
         LAG(dataKey) OVER (ORDER BY dataRange ASC) lagDataKey
  FROM data0
),
data2 AS (
  SELECT grp, dataRange, dataKey,
         CASE WHEN dataKey = NVL(lagDataKey, dataKey) THEN 0 ELSE 1 END cumu
  FROM data1
),
data3 AS (
  SELECT grp, dataRange, dataKey,
         SUM(cumu) OVER (ORDER BY dataRange) cumulativeWindow
  FROM data2
),
data4 AS (
  SELECT cumulativeWindow, dataKey,
         MAX(grp) || ' [' || TO_CHAR(MIN(dataRange)) || ' .. ' || TO_CHAR(MAX(dataRange)) || ']' DataRangeBreakDown
  FROM data3
  GROUP BY cumulativeWindow, dataKey
)
SELECT dataKey, DataRangeBreakDown FROM data4;

Variations: One of the great things about this solution is that it is data-type agnostic. Feel free to change the data-type of the dataRange column to DATE for example as shown below and it still works.

 DROP TABLE tmpDataRange; CREATE TABLE tmpDataRange AS SELECT 'A' grp,
  SYSDATE AS dataRange, 1 AS dataKey FROM Dual UNION ALL SELECT 'A', SYSDATE+1, 1 FROM
  Dual UNION ALL SELECT 'A', SYSDATE+2, 1 FROM Dual UNION ALL SELECT 'A', SYSDATE+3, 1
  FROM Dual UNION ALL SELECT 'A', SYSDATE+4, 1 FROM Dual UNION ALL SELECT 'A', SYSDATE+5,
  2 FROM Dual UNION ALL SELECT 'A', SYSDATE+6, 2 FROM Dual UNION ALL SELECT 'A',
  SYSDATE+7, 1 FROM Dual UNION ALL SELECT 'A', SYSDATE+8, 1 FROM Dual UNION ALL SELECT
  'A', SYSDATE+9, 1 FROM Dual UNION ALL SELECT 'A', SYSDATE+10, 1 FROM Dual UNION ALL
  SELECT 'A', SYSDATE+11, 1 FROM Dual ; 

Would love to hear your thoughts on this, or if you have a better solution why not share it with everyone.

How To query contiguous data ranges – SQL Server

Wednesday, January 12th, 2011

Oftentimes we are confronted with problems that seem very easy to start with, but once you get your teeth into them you realise they weren’t as easy as initially thought.
In this blog, I am attempting to work with data ranges that are not easily groupable.

Problem Statement:


Using the data structure listed below, group the data ranges by dataKey where the dataKey is contiguous; when breaks in the dataKey are identified, create a new reporting output line.
Query input-

grp dataRange dataKey Comments
A 1000 1
A 1001 1
A 1002 1
A 1003 1
A 1004 1
A 1005 2 — Notice dataKey is not contiguous
A 1006 2
A 1007 1
A 1008 1
A 1009 1
A 1010 1
A 1011 1

Desired Query output-

dataKey DataRangeBreakDown
1 A [1000 .. 1004]
2 A [1005 .. 1006]
1 A [1007 .. 1011]

Approach:


This is an interesting puzzle, because we cannot simply group the data by the “dataKey” column; doing so would ignore the breaks in the data. The key to solving this refreshing problem is to “somehow” introduce a pseudo-column which does the heavy lifting for us.
The table below illustrates the new pseudo-column; if we introduce this column using SQL, the rest is just a simple GROUP BY.

Before peeking at the final solution, I would encourage you to have a go, but if you can’t for any reason, the final solution is presented later in the blog.

grp dataRange dataKey New Column
A 1000 1 0
A 1001 1 0
A 1002 1 0
A 1003 1 0
A 1004 1 0
A 1005 2 1
A 1006 2 1
A 1007 1 2
A 1008 1 2
A 1009 1 2
A 1010 1 2
A 1011 1 2

Setup Data:

IF OBJECT_ID('tempDB..#tmpDataRange') IS NOT NULL
DROP TABLE #tmpDataRange;
WITH testData AS (
 SELECT 'A' grp, 1000 AS dataRange, 1 AS dataKey UNION ALL
 SELECT 'A', 1001, 1 UNION ALL
 SELECT 'A', 1002, 1 UNION ALL
 SELECT 'A', 1003, 1 UNION ALL
 SELECT 'A', 1004, 1 UNION ALL
 SELECT 'A', 1005, 2 UNION ALL
 SELECT 'A', 1006, 2 UNION ALL
 SELECT 'A', 1007, 1 UNION ALL
 SELECT 'A', 1008, 1 UNION ALL
 SELECT 'A', 1009, 1 UNION ALL
 SELECT 'A', 1010, 1 UNION ALL
 SELECT 'A', 1011, 1
 )
SELECT grp, dataRange, dataKey
INTO #tmpDataRange
FROM testData;

Solution:


Welcome to the solution.
I have tackled the problem with the approach above, using the Common Table Expression (CTE) feature of the SQL Server 2005+ engine. CTEs are a great addition to the query toolkit: besides allowing us to do fancy stuff like hierarchical queries, I think they are fantastic even for writing daily garden-variety queries, as they make the query so much more understandable by being written in the way we digest other information, i.e. top to bottom.
Overview of solution query:
Data0 – Instantiate the problem statement input data
Data1 – Assign incremental row numbers to input data
Data2 – Self join the data to the previous row
Data3 – This is where the “magic” happens – Build a cumulative data column which increments on every dataKey break
Data4 and the final SELECT – Are concerned with grouping and presenting the data in the format requested

Presented below is the final solution:

WITH
data0 AS (SELECT grp, dataRange, dataKey FROM #tmpDataRange),
data1 AS (SELECT UPPER(grp) grp,dataRange,dataKey, ROW_NUMBER()OVER(ORDER BY dataRange ASC) rNum FROM data0),
data2 AS (SELECT A.grp,A.dataRange,A.dataKey,A.rNum,
 CASE WHEN A.dataKey=ISNULL(B.dataKey,A.dataKey)THEN 0 ELSE 1 END cumuWindow
 FROM data1 A LEFT JOIN data1 B ON A.rNum=b.rNum+1),
data3 AS (SELECT o.grp,o.dataRange,o.dataKey,
 (SELECT SUM(i.cumuWindow) FROM data2 i WHERE i.rNum <= o.rNum) cumuWindow
 FROM data2 o),
data4 AS (SELECT dataKey,cumuWindow,MIN(grp) grp, MIN(dataRange)mn, MAX(dataRange) mx
 FROM data3
 GROUP BY dataKey,cumuWindow)
SELECT dataKey, grp + ' [' + CAST(mn AS VARCHAR(10)) + ' .. ' + CAST(mx AS VARCHAR(10)) + ']' DataRangeBreakDown
FROM data4

Variations:
One of the great things about this solution is that it is data-type agnostic. Feel free to change the data-type of the dataRange column to DATE for example as shown below and it still works.

IF OBJECT_ID('tempDB..#tmpDataRange') IS NOT NULL
DROP TABLE #tmpDataRange;
WITH testData AS (
 SELECT 'A' grp, GETDATE() AS dataRange, 1 AS dataKey UNION ALL
 SELECT 'A', GETDATE()+1, 1 UNION ALL
 SELECT 'A', GETDATE()+2, 1 UNION ALL
 SELECT 'A', GETDATE()+3, 1 UNION ALL
 SELECT 'A', GETDATE()+4, 1 UNION ALL
 SELECT 'A', GETDATE()+5, 2 UNION ALL
 SELECT 'A', GETDATE()+6, 2 UNION ALL
 SELECT 'A', GETDATE()+7, 1 UNION ALL
 SELECT 'A', GETDATE()+8, 1 UNION ALL
 SELECT 'A', GETDATE()+9, 1 UNION ALL
 SELECT 'A', GETDATE()+10, 1 UNION ALL
 SELECT 'A', GETDATE()+11, 1
 )
SELECT grp, dataRange, dataKey
INTO #tmpDataRange
FROM testData;

Would love to hear your thoughts on this, or if you have a better solution why not share it with everyone.

I am sure you would like to know how the same problem can be solved in Oracle. Find it here

Convert Outlook PST to MBOX in Snow Leopard

Sunday, June 6th, 2010


Here are the steps that I tried successfully on Snow Leopard:

  1. Download libpst source from http://www.five-ten-sg.com/libpst/packages/. Extract it in ~/Downloads directory.
  2. Download boost source from http://sourceforge.net/projects/boost/files/boost/. (will be downloaded to ~/Downloads directory by default)
  3. Open Terminal and run following commands: (replace X_XX_X and x.x.xx with the version of boost and libpst that you’ve downloaded):
    $su - root
    #cd /usr/local
    #tar --bzip2 -xf ~/Downloads/boost_X_XX_X.tar.bz2
    #cd  /usr/local/boost_X_XX_X
    #./bootstrap.sh --prefix=/usr/local/
    #./bjam install
    #cd  ~/Downloads/libpst-x.x.xx
    #./configure
    #make
    #make install
    
  4. Run this command in the terminal, from the directory where you want the MBOX created:
    $readpst -r /path/pstfile.pst
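
If you would rather not change into the target directory first, readpst can also be given an output directory with its -o option (run readpst -h to confirm the options available in your build); the paths below are placeholders:

    $readpst -r -o /path/to/output /path/pstfile.pst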