Converting XML files into Lotus Notes documents
This LotusScript agent from SearchDomino.com member Genesio Zomparelli parses XML into a standard DOM (Document Object Model) tree using the NotesDOMParser class.
This LotusScript agent uses the NotesDOMParser class to parse XML into a standard DOM (Document Object Model) tree. Additional NotesDOM classes are used to work with the DOM trees. The NotesDOMParser class is used instead of the NotesSAXParser class because the XML files we receive from Rhapsody are relatively small and consist of simple data structures. The SAX parser is designed to process large, complex XML files.
Option Public 'Set DEBUG to 1 to turn on debugging mode; 0 to turn off debugging mode Const DEBUG% = 0 Const pathXMLFile$ = "d:rhapsodyemrsdevin" Const processedPath$ = "d:rhapsodyemrsdevinprocessed" Const rejectPath$ = "d:rhapsodyemrsdevinrejects" Declarations Dim docCount As Integer Dim fileCount As Integer Dim invalidFieldError As Integer Dim resultsCount As Integer Dim stfrrCount As Integer Dim domLogFile As String Dim fieldName As String Dim header As String Dim message As String Dim NL As String Dim origXML As String Dim pathInfo As String Dim processedFile As String Dim rejectFile As String Dim sourceInfo As String Dim statusText As String Dim thisForm As String Dim xmlFilePath As String Dim fieldValue As Variant Dim db As NotesDatabase Dim domParser As NotesDOMParser Dim doc As NotesDocument Dim Form As NotesForm Dim dateTimeItem As NotesItem Dim numberItem As NotesItem Dim textItem As NotesItem Dim session As NotesSession Dim inputStream As NotesStream Dim outputStream As NotesStream Sub Initialize 'CreateDocsFromXML ------------------------------------------- ----------------------------------> 'This agent parses the origXML file and creates one new Lotus Notes Document 'using the Sample Test Result (STFRR) form.Fields 'DOM performs best with smaller XML documents that contain simple data structures. 'This method is not recommended for parsing large, complex XML documents. ' The NotesStream and DOMParser classes are new in Domino 6. ' Stream represents files as streams of binary character data. ' DOM documents are representations of XML documents, with each element, ' attribute, text value, or processing instruction in the source XML ' document represented by a node in the DOM. ' Get the name of the XML source file. ' The following code may have to be modified when we find out how we will get ' the XML files created by Rhapsody. '<------------------------------------------------ ------------------------------------------------------- Dim allXMLFiles As String Dim docNode As NotesDOMDocumentNode allXMLFiles = pathXMLFile + "*.xml" origXML = Dir$( allXMLFiles, 0 ) xmlFilePath = pathXMLFile + origXML processedFile = processedPath + origXML NL = Chr(13) + Chr(10) dateProcessed$ = Today() message = "----- DOM Parser Report - Create Notes Docs From XML - GAZ: " + dateProcessed$ + "-----" domLogFile = "c:rhapsodyemrsdevindomlog CreateDocsFromXMLlog" + _ Format$( Now, "yymmddhhnnss" ) + ".txt" Set session = New NotesSession Set db = session.CurrentDatabase Set outputStream = session.CreateStream outputStream.Open ( domLogFile ) outputStream.Truncate fileCount = 1 docCount = 0 invalidFieldError = 0 Print message outputStream.WriteText ( message + NL ) On Error Goto errh Do While origXML <> "" Set inputStream = session.CreateStream If Not inputStream.Open( xmlFilePath ) Then message = "DOM parser cannot open " + origXML + "! Processing Stopped!" Print message Goto results End If If inputStream.Bytes = 0 Then message = origXML + " is empty! Processing Stopped!" Print message Goto results End If resultsCount = 0 stfrrCount = 0 Set domParser = session.CreateDOMParser ( inputStream, outputStream ) domParser.Process Set docNode = domParser.Document Set Form = db.GetForm( "STFRR" ) thisForm = "STFRR" message = |Start parsing CaPP XML File: | + origXML + | from Rhapsody using DOM...| Print message outputStream.WriteText ( message + NL + NL ) Call walkNameValue( docNode ) '----- Finish the last Test Results document in this XML file ----- If invalidFieldError = 1 Then Call SendErrorMsg doc.ImportStatus = statusText Call doc.Save( False, False ) Else doc.ImportStatus = "Ready" Call doc.Save( False, False ) End If Print Message domParser.Output( message + NL + NL ) message = NL + "Finished parsing " + origXML + ". " + NL + "A total of " + Cstr( resultsCount ) + _ " Test Results documents were created from this file." + NL + "Ready to parse the next XML file." Print message domParser.Output( message + NL + NL ) '----- Saved the last Test Results documents in this XML file fileCount = fileCount + 1 docCount = docCount + resultsCount invalidFieldError = 0 message = "" Call moveFile( inputStream, processedFile ) origXML = Dir$() xmlFilePath = pathXMLFile + origXML processedFile = processedPath + origXML Loop dateProcessed$ = Today() message = NL + |This DOM Parser Agent processed | + Cstr( fileCount - 1) & | XML files on | & dateProcessed$ + NL + _ | A total of | + Cstr( docCount ) + | Sample Test Results documents were added to the CaPP Orphanage.| Print message results: outputStream.WriteText ( message ) Call outputStream.Close Exit Sub errh: If Err = 53 Then 'A File Not Found error occurs when an invalid XML file is moved to the rejects folder 'Rather than abort the agent just start parsing the next file in the directory Resume Next End If message = Cstr( Err ) + ": " + Error + Chr( 13 ) Resume results End Sub Sub walkNameValue( node As NotesDOMNode ) 'DOM documents are representations of XML documents, with each element, 'attribute, text value, or processing instruction in the source XML 'document represented by a node in the DOM. 'This subroutine parses the DOM tree and gets the instance and message information 'then it gets field names and their respective values and stores 'this information in the Sample Test Results form (STFRR) Dim child As NotesDOMNode Dim eNode As NotesDOMElementNode Dim attribs As NotesDOMNamedNodeMap 'Long Datatype Dim a As NotesDOMAttributeNode 'Attribute Name and Attribute Value Dim numChildNodes As Integer If Not node.IsNull Then 'Process each node type found by the DOM parser If DEBUG = 1 Then message = "The node name is: " + node.NodeName Print cstrNodeType( node ) Print message domParser.Output( CstrNodeType ( node ) + NL ) domParser.Output( message + NL ) End If Select Case node.NodeType Case DOMNODETYPE_DOCUMENT_NODE: If DEBUG = 1 Then message = node.NodeName + " has the value: null" Print message domParser.Output( message + NL ) End If ' Get the first node Set child = node.FirstChild numChildNodes = node.NumberOfChildNodes If DEBUG = 1 Then If numChildNodes < 2 Then message = node.NodeName + " has only " + Cstr( numChildNodes ) + " Child Node" Print message domParser.Output( message + NL ) Else message = node.NodeName + " has " + Cstr( numChildNodes ) + " Child Nodes" Print message domParser.Output( message + NL ) End If End If 'Get all the children for this node While numChildNodes > 0 If DEBUG = 1 Then If numChildNodes < 2 Then message = "Listing the child of " + node.NodeName + ":" Print message domParser.Output( message + NL + NL ) Else message = "Listing all the children of " + node.NodeName + ":" Print message domParser.Output( message + NL + NL ) End If End If 'Continue parsing the DOM Tree Call walkNameValue( child ) 'Get next node Set child = child.NextSibling numChildNodes = numChildNodes - 1 If DEBUG = 1 Then If numChildNodes = 1 Then message = node.NodeName + " has " + Cstr( numChildNodes ) + " more child to process..." Print message domParser.Output( message + NL + NL ) Else message = node.NodeName + " has " + Cstr( numChildNodes ) + " more children to process..." Print message domParser.Output( message + NL + NL ) End If End If Wend Case DOMNODETYPE_ELEMENT_NODE: 'This node is where we get field names If DEBUG = 1 Then message = node.NodeName + " has the value: null" Print message domParser.Output( message + NL ) End If Set eNode = node Dim TagNameLength As Integer TagNameLength = Len( eNode.TagName ) If DEBUG = 1 Then message = "The Tag name of " + node.NodeName + | is "| + eNode.TagName + |"| Print message domParser.Output( message + NL ) End If 'TRACK THE NUMBER OF TEST RESULTS DOCS CREATED FROM THIS XML FILE If Instr( eNode.TagName, "FormSTFRR" ) <> 0 Then stfrrCount = stfrrCount + 1 If stfrrCount > resultsCount Then Call doc.Save( False, False) 'Check for data in required fields validSampleFormUNID = doc.GetItemValue( "SampleFormUNID" ) validLSR104UNID = doc.GetItemValue( "LSR104DocUniqueID" ) validInvestUNID = doc.GetItemValue( "Invest_UNID" ) If validSampleFormUNID( 0 ) = "" Then statusText = "SampleFormUNID not found" fieldName = "SampleFormUNID" Call requiredField( statusText, sourceInfo, pathInfo, fieldName ) resultsCount = resultsCount + 1 Call doc.CopyToDatabase( db ) End If If validLSR104UNID( 0 ) = "" Then statusText = "LSR104DocUniqueID not found" fieldName = "LSR104DocUniqueID" Call requiredField( statusText, sourceInfo, pathInfo, fieldName ) resultsCount = resultsCount + 1 Call doc.CopyToDatabase( db ) End If If validInvestUNID( 0 ) = "" Then statusText = "Invest_UNID not found" fieldName = "Invest_UNID" Call requiredField( statusText, sourceInfo, pathInfo, fieldName ) resultsCount = resultsCount + 1 Call doc.CopyToDatabase( db ) End If If invalidFieldError = 0 Then doc.ImportStatus = "Ready" statusText = "Ready" doc.ImportSource = sourceInfo doc.ImportPath = pathInfo Call doc.Save( False, False) resultsCount = resultsCount + 1 Call doc.CopyToDatabase( db ) doc.ImportStatus = "Processing" End If End If If DEBUG = 1 Then message = "The field: ImportStatus" + " was added to " + thisForm + " with the value: " + "Ready" Print message domParser.Output( message + NL ) message = "The field: ImportSource" + " was added to " + thisForm + " with the value: " + sourceInfo Print message domParser.Output( message + NL ) message = "The field: ImportPath" + " was added to " + thisForm + " with the value: " + pathInfo Print message domParser.Output( message + NL ) message = "Saving Sample Test Results document #: " + Cstr( resultsCount ) Print message domParser.Output( message + NL ) message = "Creating Sample Test Results document #: " + Cstr( resultsCount ) Print message domParser.Output( message + NL ) End If Else fieldName = eNode.TagName End If Set child = node.FirstChild numChildNodes = node.NumberOfChildNodes If DEBUG = 1 Then If numChildNodes = 1 Then message = eNode.TagName + " has " + Cstr( numChildNodes ) + " Child Node" Print message domParser.Output( message + NL ) Else message = eNode.TagName + " has " + Cstr( numChildNodes ) + " Child Nodes" Print message domParser.Output( message + NL ) End If End If 'The path to the parent instance is an attribute of the node's child Dim numAttributes As Integer numAttributes = eNode.attributes.numberofentries If DEBUG = 1 Then If numAttributes = 1 Then message = eNode.TagName + " has " + Cstr( numAttributes ) + " Attribute" Print message + NL domParser.Output( message + NL + NL ) Else message = eNode.TagName + " has " + Cstr( numAttributes ) + " Attributes" Print message + NL domParser.Output( message + NL + NL ) End If End If Set attribs = eNode.Attributes Dim i As Integer For i = 1 To numAttributes Set a = attribs.GetItem( i ) If DEBUG = 1 Then message = "Attribute Name: " + a.NodeName + "; Value: " + a.NodeValue + NL Print message domParser.Output( message + NL ) End If 'Check the attribute's value to see if it has the EMRS instance we need If Instr( a.NodeValue, "nsf" ) = 0 Then 'The parent instance was not found in this XML file statusText = "Instance Not Found" Call requiredField( statusText, sourceInfo, pathInfo, fieldName ) Else 'THE PATH TO THE PARENT INSTANCE WAS FOUND fieldName = a.Nodename fieldValue = a.Nodevalue Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) If DEBUG = 1 Then Call AddedFieldMsg ( fieldName, fieldValue ) End If End If Next 'Get next child While numChildNodes > 0 Call walkNameValue( child ) Set child = child.NextSibling numChildNodes = numChildNodes - 1 If DEBUG = 1 Then If numChildNodes = 1 Then message = eNode.TagName + " has " + Cstr( numChildNodes ) + " more child to process..." Print message domParser.Output( message + NL ) Else message = eNode.TagName + " has " + Cstr( numChildNodes ) + " more children to process..." Print message domParser.Output( message + NL ) End If End If Wend If DEBUG = 1 Then message = "DOM is finished listing all of the children of " + eNode.TagName Print NL + message domParser.Output( NL + message + NL + NL ) End If Case DOMNODETYPE_DOCUMENTTYPE_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: null" Print message domParser.Output( message + NL) End If Case DOMNODETYPE_TEXT_NODE: 'THIS IS WHERE WE FIND THE FIELD VALUES Dim valueLength As Integer Dim fieldLength As Integer fieldValue = Fulltrim( node.NodeValue ) valueLength = Len( fieldValue ) fieldLength = Len ( fieldName ) If DEBUG = 1 Then message = "Field: " + fieldName + " Value: " + fieldValue Print message domParser.Output( message + NL) End If If fieldLength > 0 And valueLength > 0 Then If fieldName = "ImportSource" Then 'This is when we create a new orphan document sourceInfo = fieldValue resultsCount = resultsCount + 1 Set doc = New NotesDocument( db ) doc.Form = thisForm Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True Set textItem = doc.ReplaceItemValue ( ImportStatus, "Processing" ) message = "The field: " + "ImportStatus" + " was added to " + thisForm + " with the value: " + "Processing" Print message domParser.Output( message + NL) Else 'All other fields must be saved with the proper datatype Call getDataType ( doc, fieldName, fieldValue, thisForm ) If DEBUG = 1 Then Call AddedFieldMsg( fieldName, fieldValue ) End If End If If fieldName = "ImportPath" Then pathInfo = fieldValue End If End If 'STOP HERE: The following nodes are not used by this agent Case DOMNODETYPE_COMMENT_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: " + node.NodeValue Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_PROCESSING INSTRUCTION_NODE: If DEBUG = 1 Then 'This node type is not required Set piNode = node message = node.NodeName + " has processing instruction with Target " + piNode.Target + " and Data " + piNode.Data Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_ENTITY_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: " + node.NodeValue Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_ENTITY REFERENCE_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: null" Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_CDATASECTION_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: " + node.NodeValue Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_ATTRIBUTE_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: " + node.NodeValue Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_DOCUMENT FRAGMENT_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: " + node.NodeValue Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_NOTATION_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: " + node.NodeValue Print message domParser.Output( message + NL + NL ) End If Case DOMNODETYPE_XMLDECL_NODE: If DEBUG = 1 Then 'This node type is not required message = node.NodeName + " has the value: " + node.NodeValue Print message domParser.Output( message + NL + NL ) End If Case Else: message = "Ignoring node: " + Cstr( node.NodeType ) Print message domParser.Output( message + NL ) End Select 'node.NodeType End If 'Not node.IsNull End Sub Function CstrNodeType (currentNode As NotesDOMNode) As String If Not currentNode.IsNull Then Select Case currentNode.NodeType Case DOMNODETYPE_ELEMENT_NODE: CstrNodeType = "DOMNODETYPE_ ELEMENT_NODE" Case DOMNODETYPE_ATTRIBUTE_NODE: CstrNodeType = "DOMNODETYPE_ ATTRIBUTE_NODE" Case DOMNODETYPE_TEXT_NODE: CstrNodeType = "DOMNODETYPE_ TEXT_NODE" Case DOMNODETYPE_CDATASECTION_NODE: CstrNodeType = "DOMNODETYPE_ CDATASECTION_NODE" Case DOMNODETYPE_ENTITYREFERENCE_NODE: CstrNodeType = "DOMNODETYPE_ ENTITYREFERENCE_NODE" Case DOMNODETYPE_ENTITY_NODE: CstrNodeType = "DOMNODETYPE_ ENTITY_NODE" Case DOMNODETYPE_PROCESSING INSTRUCTION_NODE: CstrNodeType = "DOMNODETYPE_ PROCESSINGINSTRUCTION_NODE" Case DOMNODETYPE_COMMENT_NODE: CstrNodeType = "DOMNODETYPE_ COMMENT_NODE" Case DOMNODETYPE_DOCUMENT_NODE: CstrNodeType = "DOMNODETYPE_ DOCUMENT_NODE" Case DOMNODETYPE_ DOCUMENTTYPE_NODE: CstrNodeType = "DOMNODETYPE_ DOCUMENTTYPE_NODE" Case DOMNODETYPE_ DOCUMENTFRAGMENT_NODE: CstrNodeType = "DOMNODETYPE_ DOCUMENTFRAGMENT_NODE" Case DOMNODETYPE_ NOTATION_NODE: CstrNodeType = "DOMNODETYPE_ NOTATION_NODE" Case DOMNODETYPE_ XMLDECL_NODE: CstrNodeType = "DOMNODETYPE_ XMLDECL_NODE" End Select Else CstrNodeType = "Null Node" End If End Function Sub SendErrorMsg 'Send error message with original XML file attached 'Move the faulty XML file to the Rejects folder Dim memodoc As NotesDocument Dim richitem As NotesRichTextItem Dim object As NotesEmbeddedObject Dim recipients (1 To 3 ) As String 'Dim recipients( 1 ) As String Set memodoc = New NotesDocument( db ) processedFile = rejectPath + origXML message = "There was an error processing the following XML file: " + origXML + NL + NL + _ "The problem was: " + statusText + NL + NL + _ "The file data source originated from: " + sourceInfo + NL + NL + _ "The XML file was created by: " + pathInfo + NL Print message domParser.Output( message + NL ) Call moveFile( inputStream, processedFile ) memodoc.Form = "Memo" recipients( 1 ) = "Geno A Zomparelli/CO/APHIS/USDA" recipients( 2 ) = "Luke J Wagner/CO/APHIS/USDA" recipients( 3 ) = "Timothy A Rowell/CO/APHIS/USDA" 'recipients( 4 ) = "Terry H Matson/CO/APHIS/USDA" memodoc.SendTo = recipients memodoc.Subject = "ERROR PROCESSING XML FILE {DEMO}" memodoc.Body = message Set richitem = New NotesRichTextItem ( memodoc, "object" ) Set object = richitem.EmbedObject ( EMBED_ATTACHMENT, "", processedFile) Call memodoc.Send( False, recipients ) message = "" End Sub Sub AddedFieldMsg( fieldName, fieldValue ) message = "The field: " + fieldName + " was added to " + thisForm + " with the value: " + fieldValue Print message domParser.Output( message + NL ) message = "" End Sub Sub moveFile( inputStream, processedFile ) Call inputStream.Close Filecopy xmlFilePath, processedFile Kill xmlFilePath End Sub Sub requiredField( statusText, sourceInfo, pathInfo, fieldName ) message = "The file named " + origXML + " could not be processed because " + _ fieldName + " either did not have a value or the value was not valid." Print message domParser.Output( message + NL + NL ) doc.ImportStatus = statusText doc.ImportSource = sourceInfo doc.ImportPath = pathInfo invalidFieldError = 1 Call doc.Save( False, False ) End Sub Sub getDataType( doc, fieldName, fieldValue, thisForm ) Dim textItem As NotesItem Dim dateTimeItem As NotesItem Dim numberItem As NotesItem Dim attachItem As NotesItem Dim embeddedItem As NotesEmbeddedObject Dim errItem As NotesItem Dim nameItem As NotesName Dim authorItem As NotesName Dim readerItem As NotesName Dim linksItem As NotesRichTextDoclink Dim refsItem As NotesItem Dim rtItem As NotesRichTextItem Dim sigItem As NotesItem Dim rtRangeItem As NotesRichTextRange Dim Form As NotesForm Dim fieldType As Variant Set Form = db.GetForm( thisForm ) fieldType = Form.GetFieldType( fieldName ) If DEBUG = 1 Then Print fieldType End If Select Case fieldType Case TEXT: Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True Case NUMBERS: Set numberItem = doc.ReplaceItemValue ( fieldName, fieldValue ) numberItem.IsSummary = True Case DATETIMES: Set dateTimeItem = doc.ReplaceItemValue ( fieldName, fieldValue ) dateTimeItem.IsSummary = True Case UNKNOWN: Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True Case UNAVAILABLE: Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True Case SIGNATURES: Set sigItem = doc.ReplaceItemValue ( fieldName, fieldValue ) sigItem.IsSummary = True Case 1281: Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True Case 1025: Set dateTimeItem = doc.ReplaceItemValue ( fieldName, fieldValue ) dateTimeItemItem.IsSummary = True Case ATTACHMENT: Set attachItem = doc.ReplaceItemValue ( fieldName, fieldValue ) attachItem.IsSummary = True Case ERRORITEM: Set errItem = doc.ReplaceItemValue ( fieldName, fieldValue ) errItem.IsSummary = True Case NAMES: Set nameItem = session.CreateName ( fieldName, fieldValue ) Case AUTHORS: Set authorItem = session.CreateName ( fieldName, fieldValue ) Case READERS: Set readerItem = session.CreateName ( fieldName, fieldValue ) Case NOTEREFS: Set refsItem = doc.ReplaceItemValue ( fieldName, fieldValue ) Case RICHTEXT: Set rtItem = doc.ReplaceItemValue ( fieldName, fieldValue ) rtItem.IsSummary = True Case USERDATA: Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True Case USERID: Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True 'Case EMBEDDEDOBJECT: 'Set embeddedItem = doc.ReplaceItemValue( fieldName, fieldValue ) 'Case NOTELINKS: 'Set linksItem = doc.ReplaceItemValue ( fieldName, fieldValue ) Case Else: Set textItem = doc.ReplaceItemValue ( fieldName, fieldValue ) textItem.IsSummary = True End Select End Sub
The code cannot be used because of line breaks from posting to the Web. How about having a version we can copy that does not have line breaks in it?
—Dan J.
******************************************
By clicking on this link, you'll be able to see and/or download this code as the tip's author originally intended it to be. If you have any further questions about it, please don't hesitate to email me. Thanks.
—Matt Gervais, Assistant Site Editor, SearchDomino.com
Do you have comments on this tip? Let us know.
This tip was submitted to the SearchDomino.com tip library by member Genesio Zomparelli. Please let others know how useful it is via the rating scale at the end of the tip. Do you have a useful Lotus Notes, Domino, Workplace or WebSphere tip or code snippet to share? Submit it to our monthly tip contest and you could win a prize.