p2p.wrox.com Forums


pvasudevan September 5th, 2006 02:52 AM

Transforming data in required XML Format
 
Hi, I'm trying to transform the data from a text file, where the data is in a comma- or tab-delimited format, to an XML format, which is as given below:
<UserAccessDS>
    <Access>
        <ww />
        <un />
        <rl />
    </Access>
</UserAccessDS>

The data in the text file is as given below:
1065,steo,PARIS_SPEND_DV_TPP
1099,steo,JAPAN_SPEND_IOP_TPP
1015,steo,PARIS_FUTURE_LOK_TPP

I'm expecting the output XML to be as follows:
<UserAccessDS>
    <Access>
        <ww>1065</ww>
        <un>steo</un>
        <rl>PARIS_SPEND_DV_TPP</rl>
    </Access>
    <Access>
        <ww>1099</ww>
        <un>steo</un>
        <rl>JAPAN_SPEND_IOP_TPP</rl>
    </Access>
    <Access>
        <ww>1015</ww>
        <un>steo</un>
        <rl>PARIS_FUTURE_LOK_TPP</rl>
    </Access>
</UserAccessDS>

I'm using an XSLT stylesheet to transform it to the required XML format with the Saxon transformer. The following parameters identify the root element, the per-line element and the innermost value element:
  <xsl:param name="root" select="'UserAccessDS'"/>
  <xsl:param name="line" select="'Access'"/>
  <xsl:param name="entry" select="'ww'"/>

The innermost template, which reads each value from a line of the text file (delimited by a comma or a tab) and writes it as the innermost element of the XML output, is as follows:
<xsl:template name="tValues">
    <xsl:param name="value" select="''"/>
    <xsl:analyze-string select="$value" regex=",|\t">
      <xsl:matching-substring/>
      <xsl:non-matching-substring>
        <xsl:element name="{$entry}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
</xsl:template>

Currently my xml output is formatted as follows:
<UserAccessDS>
    <Access>
        <ww>1065</ww>
        <ww>steo</ww>
        <ww>PARIS_SPEND_DV_TPP</ww>
    </Access>
    <Access>
        <ww>1099</ww>
        <ww>steo</ww>
        <ww>JAPAN_SPEND_IOP_TPP</ww>
    </Access>
    <Access>
        <ww>1015</ww>
        <ww>steo</ww>
        <ww>PARIS_FUTURE_LOK_TPP</ww>
    </Access>
</UserAccessDS>

As you can see from the above output, every innermost element is written as <ww>data</ww>. What I want is for each value to be placed in an element with a different tag name, i.e. each value between the commas should be written inside its own tag (ww, un, rl). So I wanted to know whether I could achieve this by modifying the initial parameter to:
  <xsl:param name="entry" select="'ww','un','rl'"/>
and then, in the template, adding an <xsl:for-each select="$entry"> and mapping each value appropriately.

Can anyone give me a tip on how I could achieve this?

Thanks,
Praveen


mhkay September 5th, 2006 04:04 AM

Inside xsl:analyze-string, position() gives you the position in the sequence of matching and non-matching substrings. So the non-matching ones are at positions 1,3,5 etc. So you should be able to write:

      <xsl:non-matching-substring>
        <xsl:variable name="p" select="(position()+1) div 2"/>
        <xsl:element name="{$entry[$p]}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:non-matching-substring>
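
To see the position arithmetic in isolation, here is a minimal self-contained sketch. It is only an illustration: the `row` parameter and the `main` template name are made up for this example, and the sample row is inlined so that no input file is needed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Element names, one per column (same parameter as in the question) -->
  <xsl:param name="entry" select="'ww','un','rl'"/>

  <!-- Hypothetical sample row, inlined so the sketch runs without an input file -->
  <xsl:param name="row" select="'1065,steo,PARIS_SPEND_DV_TPP'"/>

  <!-- Hypothetical named entry point -->
  <xsl:template name="main">
    <Access>
      <xsl:analyze-string select="$row" regex=",|\t">
        <xsl:non-matching-substring>
          <!-- Substrings alternate value, delimiter, value, and so on,
               so the values sit at positions 1, 3, 5 and map to 1, 2, 3 -->
          <xsl:variable name="p" select="(position()+1) div 2"/>
          <xsl:element name="{$entry[$p]}">
            <xsl:value-of select="."/>
          </xsl:element>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </Access>
  </xsl:template>

</xsl:stylesheet>
```

Run with a named initial template (for example Saxon's -it option), this should produce one Access element containing ww, un and rl. Note that the odd-position assumption only holds when every field is non-empty: xsl:analyze-string does not report zero-length non-matching substrings, so a row such as 1065,,X would shift the positions.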


Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference

pvasudevan September 5th, 2006 05:49 AM

Thank you Michael, that worked like a charm.

I have one more question, about dropping a particular value from each CSV row of the input file. For instance, if the input data is of the format:

1065,steo,PARIS_SPEND_DV_TPP,Enterprise:SAPPayroll
1015,steo,PARIS_FUTURE_LOK_TPP,Enterprise:SAPTransactionsDetail

Then, while transforming this data to the XML format, I'd like to skip the 3rd value (PARIS_SPEND_DV_TPP and PARIS_FUTURE_LOK_TPP above), so that the final output would be:

<UserAccessDS>
    <Access>
        <ww>1065</ww>
        <un>steo</un>
        <rl>Enterprise:SAPPayroll</rl>
    </Access>
    <Access>
        <ww>1015</ww>
        <un>steo</un>
        <rl>Enterprise:SAPTransactionsDetail</rl>
    </Access>
</UserAccessDS>

I guess I should modify something around <xsl:value-of select="."/> to achieve this. Could you help me with this as well?

Many thanks,
Praveen


pvasudevan September 5th, 2006 07:21 AM

So I tried changing the template to include an xsl:choose, like this:

  <xsl:template name="tValues">
    <xsl:param name="value" select="''"/>
    <xsl:analyze-string select="$value" regex=",|\t">
      <xsl:matching-substring/>
      <xsl:non-matching-substring>
        <xsl:variable name="p" select="(position()+1) div 2"/>
        <xsl:element name="{$entry[$p]}">
          <xsl:choose>
            <xsl:when test="{$p}=5">
              <xsl:value-of select="following-sibling::{$value[4]}"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:element>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

But I guess I'm making some syntactical errors, primarily because I'm new to XSLT programming.

First of all, writing a statement like <xsl:when test="{$p}=5">, where the variable's value is being tested, raises an error. I also wonder why there is no "else" part to the <xsl:if> element.

following-sibling, I guess, works on data that is already in XML format, which we then read. But in my case the data I'm trying to read is in CSV format from a text file, so I wonder whether following-sibling would work at all. I also wonder whether writing something like <xsl:value-of select=".."/> after a valid "if" condition would work.

Primarily, I first need to understand how to write a valid "if" statement, and then how to write the "value-of select" correctly to read the next value.

Any tips would be great.

Thanks,
Praveen


joefawcett September 5th, 2006 08:03 AM

XSLT is not a language you can pick up as you go along; you need to work through some basic documentation first.
I can't count the number of people who've asked for help because they were too busy to learn the basics, and who then spent at least three times as long getting their code to work.

For example, the use of attribute value templates is restricted: they are neither needed nor allowed in select, match or test attributes, which already contain XPath expressions (or patterns).
Code:

<xsl:when test="{$p}=5">
=>
Code:

<xsl:when test="$p = 5">
--

Joe (Microsoft MVP - XML)

mhkay September 5th, 2006 08:11 AM

>First of all: writing a statement like this <xsl:when test="{$p}=5"> where the variable value is being tested itself is erroring out.

You want test="$p=5". You never use curly brackets inside an XPath expression, only to separate an XPath expression from surrounding text in an Attribute Value Template.
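
As a rough illustration of where the curly brackets do and do not belong, here is a small sketch. The `p` parameter, the `info` element with its `label` attribute, and the `main` template name are invented for this example only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:param name="entry" select="'ww','un','rl'"/>

  <!-- Hypothetical column number, fixed here for illustration -->
  <xsl:param name="p" select="2"/>

  <!-- Hypothetical named entry point -->
  <xsl:template name="main">
    <!-- name is an attribute value template: braces mark off the XPath expression -->
    <xsl:element name="{$entry[$p]}">
      <!-- Attributes of literal result elements are AVTs too, so text and {expr} can be mixed -->
      <info label="column-{$p}">
        <!-- test and select always hold XPath expressions, so no braces here -->
        <xsl:if test="$p = 2">
          <xsl:value-of select="$entry[$p]"/>
        </xsl:if>
      </info>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

With p set to 2, the outer element is named un and the info element carries label="column-2".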

> Also I wonder why there is no "else" part to the <xsl:if> element.

Because there is an xsl:choose instruction for that purpose, with xsl:when and xsl:otherwise branches.
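
For comparison, a minimal sketch of the two ways to write an if/else in XSLT 2.0: the xsl:choose instruction, and the XPath 2.0 conditional expression inside a select attribute. The `p` parameter, the `results`/`result` element names and the `main` template name are placeholders for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Hypothetical column number to test -->
  <xsl:param name="p" select="3"/>

  <!-- Hypothetical named entry point -->
  <xsl:template name="main">
    <results>
      <!-- Instruction form: xsl:when is the "if", xsl:otherwise is the "else" -->
      <result>
        <xsl:choose>
          <xsl:when test="$p = 3">third column</xsl:when>
          <xsl:otherwise>some other column</xsl:otherwise>
        </xsl:choose>
      </result>
      <!-- Expression form: XPath 2.0 if (...) then ... else ... -->
      <result>
        <xsl:value-of select="if ($p = 3) then 'third column' else 'some other column'"/>
      </result>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

Both result elements should contain the same text; the expression form is only available in XPath 2.0 processors such as Saxon.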

>following-sibling, I guess, works on data that is already in XML format, which we then read. But in my case the data I'm trying to read is in CSV format from a text file, so I wonder whether following-sibling would work at all.

following-sibling, like the other XPath axes, navigates within an XML tree. You can't use it unless (or until) your text is structured as a tree.

><xsl:when test="{$p}=5">
        <xsl:value-of select="following-sibling::{$value[4]}"/>

I'm afraid this isn't XSLT, I can't guess what you intend it to mean, so I can't tell you the correct code to write in its place.

Michael Kay
http://www.saxonica.com/
Author, XSLT Programmer's Reference and XPath 2.0 Programmer's Reference

pvasudevan September 6th, 2006 02:34 AM

Thanks for all the inputs. The complete XSLT that I have to do the transformation is pasted below:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:param name="doc" select="'../userinput/comma-edw.txt'"/>

  <xsl:param name="enc" select="'UTF-8'"/>

  <xsl:param name="root" select="'UserAccessDS'"/>

  <xsl:param name="line" select="'Access'"/>

  <xsl:param name="entry" select="'ww','un','rl'"/>

  <!--
    main template
  -->
  <xsl:template match="/">
    <xsl:element name="{$root}">
      <xsl:call-template name="tLines">
        <xsl:with-param name="value" select="unparsed-text($doc, $enc)"/>
      </xsl:call-template>
    </xsl:element>
  </xsl:template>
  <!--
    tokenize lines
  -->
  <xsl:template name="tLines">
    <xsl:param name="value" select="''"/>
    <xsl:analyze-string select="$value" regex="\n|\r">
      <xsl:matching-substring/>
      <xsl:non-matching-substring>
        <xsl:element name="{$line}">
          <xsl:call-template name="tValues">
            <xsl:with-param name="value" select="."/>
          </xsl:call-template>
        </xsl:element>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <xsl:template name="tValues">
    <xsl:param name="value" select="''"/>
    <xsl:analyze-string select="$value" regex=",|\t">
      <xsl:matching-substring/>
      <xsl:non-matching-substring>
        <xsl:variable name="p" select="(position()+1) div 2"/>
        <xsl:element name="{$entry[$p]}">
          <xsl:choose>
            <xsl:when test="$p=5">
              <xsl:value-of select="."/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:element>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>

So Michael, the output XML that I'm trying to generate should be as given below:

<UserAccessDS>
    <Access>
        <ww>1065</ww>
        <un>steo</un>
        <rl>Enterprise:SAPPayroll</rl>
    </Access>
    <Access>
        <ww>1015</ww>
        <un>steo</un>
        <rl>Enterprise:SAPTransactionsDetail</rl>
    </Access>
</UserAccessDS>

The input data from the comma-separated text file is as shown below:

1065,steo,PARIS_SPEND_DV_TPP,Enterprise:SAPPayroll
1015,steo,PARIS_FUTURE_LOK_TPP,Enterprise:SAPTransactionsDetail

As you can see, the output XML has a tag <rl />, which encloses the data from the 4th column of the comma-separated input row and not the 3rd column value. For instance, <rl>Enterprise:SAPTransactionsDetail</rl> holds the 4th column value and not the 3rd column value PARIS_FUTURE_LOK_TPP. I understand that I'm supposed to handle this in the template tValues.

      <xsl:non-matching-substring>
        <xsl:variable name="p" select="(position()+1) div 2"/>
        <xsl:element name="{$entry[$p]}">
          <xsl:choose>
            <xsl:when test="$p=5">
              I need to select the 4th column value from the input data
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:element>
      </xsl:non-matching-substring>

In my previous attempt I tried using following-sibling, but I failed miserably. From your recent post I learnt that following-sibling works only on an existing XML tree structure and not on arbitrary data.

My problem now is to read the input data, take the 4th column value and write it within the <rl /> tag. I hope I have been able to explain my requirements clearly.

Please let me know how I could achieve this.

Many Thanks,
Praveen
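
One possible way to approach this, sketched under a few assumptions and not taken from any reply in the thread: instead of walking the substrings with xsl:analyze-string, split each line with the XPath 2.0 tokenize() function and pick fields by column number. The `columns` and `text` parameters and the `main` template name are hypothetical; `columns` says which input column feeds each name in `entry`, and the sample rows are inlined so the sketch runs without an input file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output indent="yes"/>

  <!-- Element names, as in the stylesheet above -->
  <xsl:param name="entry" select="'ww','un','rl'"/>

  <!-- Hypothetical: input column used for each element name (3rd column skipped) -->
  <xsl:param name="columns" select="1, 2, 4"/>

  <!-- Hypothetical inline sample; &#10; is a newline between the two rows -->
  <xsl:param name="text"
    select="'1065,steo,PARIS_SPEND_DV_TPP,Enterprise:SAPPayroll&#10;1015,steo,PARIS_FUTURE_LOK_TPP,Enterprise:SAPTransactionsDetail'"/>

  <!-- Hypothetical named entry point -->
  <xsl:template name="main">
    <UserAccessDS>
      <!-- One iteration per non-blank line -->
      <xsl:for-each select="tokenize($text, '\r?\n')[normalize-space()]">
        <!-- Split the line into its fields on comma or tab -->
        <xsl:variable name="fields" select="tokenize(., ',|\t')"/>
        <Access>
          <xsl:for-each select="$entry">
            <xsl:variable name="i" select="position()"/>
            <!-- The context item is the element name; look up its column number -->
            <xsl:element name="{.}">
              <xsl:value-of select="$fields[$columns[$i]]"/>
            </xsl:element>
          </xsl:for-each>
        </Access>
      </xsl:for-each>
    </UserAccessDS>
  </xsl:template>

</xsl:stylesheet>
```

To use the real file, the inline `text` value could be replaced by unparsed-text($doc, $enc) as in the stylesheet above. Unlike the position-based approach, tokenize() also returns empty strings for empty fields, so column numbers stay aligned when a value is missing.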



All times are GMT -4.