Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

I am using DotNetCore.NPOI (1.2.1) in order to read an MS Excel file.

Some of the cells are of type text and contain formatted strings (e.g. some words in bold).

How do I get the formatted cell value? My final goal: Retrieve the cell text as HTML.

I tried

var cell = row.GetCell(1);
var richStringCellValue = cell.RichStringCellValue;

But this won't let me access the formatted string (just the plain string without formattings).

Does anybody have an idea or solution?

I think you'll have to take longer route in this case. First you'll have to maintain the formatting of cell value like date, currency etc and then extract the style from cell value and embed the cell value under that style.

best option is to write extenstion method to get format and style value.

To get the fomat Please see this link How to get the value of cell containing a date and keep the original formatting using NPOI

For styling first you'll have to check and find the exact style of running text and then return the value inside the html tag , below method will give you idea to extract styling from cell value. Code is untested , you may have to include missing library.

 public void GetStyleOfCellValue()
            XSSFWorkbook wb = new XSSFWorkbook("YourFile.xlsx");
            ISheet sheet = wb.GetSheetAt(0);
            ICell cell = sheet.GetRow(0).GetCell(0);  
            XSSFRichTextString richText = (XSSFRichTextString)cell.RichStringCellValue;
            int formattingRuns = cell.RichStringCellValue.NumFormattingRuns;
            for (int i = 0; i < formattingRuns; i++)
                int startIdx = richText.GetIndexOfFormattingRun(i);
                int length = richText.GetLengthOfFormattingRun(i);
                Console.WriteLine("Text: " + richText.String.Substring(startIdx, startIdx + length));
                if (i == 0)
                    short fontIndex = cell.CellStyle.FontIndex;
                    IFont font = wb.GetFontAt(fontIndex);
                    Console.WriteLine("Bold: " + (font.IsBold)); // return string <b>my string</b>.
                    Console.WriteLine("Italics: " + font.IsItalic + "\n"); // return string <i>my string</i>. 
                    Console.WriteLine("UnderLine: " + font.Underline + "\n"); // return string <u>my string</u>. 
                    IFont fontFormat = richText.GetFontOfFormattingRun(i);
                    Console.WriteLine("Bold: " + (fontFormat.IsBold)); // return string <b>my string</b>.
                    Console.WriteLine("Italics: " + fontFormat.IsItalic + "\n");// return string <i>my string</i>. 
                Ohhhhhhhhh. THIS is how this is supposed to work. The whole string is divided into "FormattingRuns" and I can then check the basic formattings of a single formattingRun. Horrible. ;) But awesome, that you made this clear to me. Thanks, Kumar. This helps me a lot!
– Ingmar
                Oct 19, 2018 at 12:00

Font formatting in XLSX files are stored according to schema http://schemas.openxmlformats.org/spreadsheetml/2006/main which has no direct relationship to HTML tags. Therefore your task is not that much straight forward.

style = cell.getCellStyle();
font = style.getFont(); // or style.getFont(workBook);
// use Font object to query font attributes. E.g. font.IsItalic

Then you will have to build the HTML by appending relevant HTML tags.

Hello TRINE, thank you for your answer. However, I am not sure how this is going to help me. The link you provided does not seem to exist (any more). And the code snippet (I tried: var style = row.GetCell(1).CellStyle; var font = style.GetFont(workbook);) doesn't help me at all. By exploring the font object I couldn't find any properties that would help me analyzing the different formattings inside cell's RichStringCellValue. Am I missing the obvious? – Ingmar Oct 18, 2018 at 17:48 @Ingmar It is not a link. See what are schema namespaces oracle.com/technetwork/articles/… – TRiNE Oct 19, 2018 at 3:54 @Ingmar Btw, this library has limited implementation. Moving from this library will work if it is possible. Can use excel instance; forums.asp.net/t/… – TRiNE Oct 19, 2018 at 4:06 Thanks again for your help, TRiNE. Kumar (above) gave me the idea/information that I needed. And yes, I already figured, that NPOI is a bit limited and somehow uncomfortable to use. However, it seems to be the only solution that is available for free for ASP.NET CORE so far. – Ingmar Oct 19, 2018 at 12:02

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.