Get the first 200 characters of a string without breaking HTML tags at the end

bruce

Hi,

I want to get the first few words (100 or 200) from a long summary (the content may include HTML) using C#.

My requirement is to display a short description of a long summary, and this content may include HTML elements. I can already truncate a plain string, but when the content is HTML, the elements get cut off in the middle. For example, I get output like this:

<span style="FONT-FAMILY: Trebuchet MS">Heading</span>
</H3><span style="FONT-FAMILY: Trebuchet MS">
<font style="FONT-SIZE: 15px;


But it should return the string with complete HTML elements.

I use a jQuery editor to get the content from the user.
How can I get the short summary?

Posted On: 30-May-2015 22:28

Answers
Rahul Maurya

Hi bruce,

You can use the following methods.

The htmlcontent variable contains the HTML content as well as plain text.

string stringcontent = TrimNewLines(RemoveDoubleNewLines(RemoveAllTags(htmlcontent)));
if (stringcontent.Length > 300)
{
    stringcontent = stringcontent.Substring(0, 300) + " ...";
}

 

// Requires: using System.Text.RegularExpressions;

/// <summary>
/// Replaces every tag with a new line
/// </summary>
private static string RemoveAllTags(string str)
{
    string strWithoutTags = Regex.Replace(str, "<[^>]*>", "\n");
    return strWithoutTags;
}

/// <summary>
/// Replaces a sequence of new lines with a single new line
/// </summary>
private static string RemoveDoubleNewLines(string str)
{
    string pattern = "[\n]+";
    return Regex.Replace(str, pattern, "\n");
}

/// <summary>
/// Removes new lines from the start and end of the string
/// </summary>
private static string TrimNewLines(string str)
{
    // Skip leading new lines.
    int start = 0;
    while (start < str.Length && str[start] == '\n')
    {
        start++;
    }

    // Skip trailing new lines.
    int end = str.Length - 1;
    while (end >= 0 && str[end] == '\n')
    {
        end--;
    }

    // The string contained nothing but new lines.
    if (start > end)
    {
        return string.Empty;
    }

    string trimmed = str.Substring(start, end - start + 1);
    return trimmed;
}
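Note that the helpers above discard all markup, so the result is plain text. If the short summary has to keep its HTML, one possible approach is to count only the visible characters, copy each tag whole, and close whatever is still open at the cut. The following is only a rough sketch of that idea, not a full HTML parser: the names HtmlTruncateSketch, Truncate and VoidTags are made up for this illustration, and it assumes reasonably well-formed markup with no '>' inside attribute values or comments. Entities such as &amp; are counted character by character and can still be split.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HtmlTruncateSketch
{
    // Hypothetical helper list: elements that never take a closing tag.
    private static readonly HashSet<string> VoidTags = new HashSet<string>(
        StringComparer.OrdinalIgnoreCase) { "br", "hr", "img", "input", "meta", "link" };

    // Copies html until maxTextLength visible characters have been emitted,
    // never splitting a tag, then closes any elements that are still open.
    public static string Truncate(string html, int maxTextLength)
    {
        if (string.IsNullOrEmpty(html)) return html;

        var output = new StringBuilder();
        var open = new Stack<string>();   // names of elements opened but not yet closed
        int textCount = 0;
        int i = 0;
        bool truncated = false;

        while (i < html.Length)
        {
            if (html[i] != '<')
            {
                if (textCount >= maxTextLength) { truncated = true; break; }
                output.Append(html[i]);   // visible text counts toward the limit
                textCount++;
                i++;
                continue;
            }

            int close = html.IndexOf('>', i);
            if (close < 0) break;         // unterminated tag at the end: drop it

            string tag = html.Substring(i, close - i + 1);
            output.Append(tag);           // tags are copied whole and are not counted
            i = close + 1;

            string inner = tag.Substring(1, tag.Length - 2).Trim();
            // Skip empty tags, comments, doctypes and self-closing tags.
            if (inner.Length == 0 || inner[0] == '!' || inner[0] == '?' ||
                inner[inner.Length - 1] == '/')
                continue;

            bool isClosing = inner[0] == '/';
            string name = inner.TrimStart('/').Split(' ', '\t', '\r', '\n')[0];

            if (isClosing)
            {
                if (open.Count > 0 &&
                    string.Equals(open.Peek(), name, StringComparison.OrdinalIgnoreCase))
                    open.Pop();
            }
            else if (!VoidTags.Contains(name))
            {
                open.Push(name);
            }
        }

        if (truncated) output.Append("...");
        while (open.Count > 0)
            output.Append("</").Append(open.Pop()).Append('>');
        return output.ToString();
    }
}
```

For example, HtmlTruncateSketch.Truncate("<p>Hello <b>world</b></p>", 8) gives "<p>Hello <b>wo...</b></p>". For arbitrary user input, a real HTML parser is probably a safer choice than this kind of scanning.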

 

Posted On: 30-May-2015 16:07
bruce

Hi Rahul Maurya,

It's working for me.

Posted On: 31-May-2015 07:36
Smith

Hi guys,

I found a more complete class library for processing HTML content:

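using System;
using System.Text.RegularExpressions;
using System.Web; // HttpUtility and HttpContext; needs a reference to System.Web

public static class HtmlProcessing
  {
    // Matches any tag lazily, from '<' up to the next '>'.
    private static readonly Regex _htmlRegex = new Regex("<.*?>", RegexOptions.Compiled);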

    public static string StripTagsRegex(string source)
    {
      if (String.IsNullOrEmpty(source))
        return source;
      return _htmlRegex.Replace(source, string.Empty);
    }

    public static string PrepareString(string source)
    {
      if (String.IsNullOrEmpty(source))
        return source;
      return HttpUtility.HtmlEncode(PrepareText(source));
    }

    public static string PrepareText(string source)
    {
      if (String.IsNullOrEmpty(source))
        return source;
      return StripTagsRegex(
        HttpUtility.HtmlDecode(
          source.Replace("\n", String.Empty)
          .Replace("\r", String.Empty)
          .Replace("\t", String.Empty)));
    }

    public static string GetShortString(string source, int minLength, int maxLength, string delimiter)
    {
      if (String.IsNullOrEmpty(source))
        return source;
      if (minLength > maxLength)
        throw new ArgumentException("Wrong minimum and maximum lengths.");
      if (source.Length <= maxLength)
        return source;
      string res = source.Substring(0, maxLength);
      string append = "";
      // Prefer to cut right after the last delimiter (for example, the end of a sentence).
      // An empty delimiter marks no boundary, so the search is skipped in that case.
      int pos = String.IsNullOrEmpty(delimiter) ? -1 : res.LastIndexOf(delimiter, StringComparison.Ordinal);
      if (pos < minLength)
      {
        // No delimiter far enough in: fall back to the last space and drop the space itself.
        pos = res.LastIndexOf(' ');
        if (pos < minLength)
          pos = Math.Max(maxLength - 4, -1); // hard cut that leaves room for "..."
        else
          pos = pos - 1;
        append = "...";
      }
      return res.Substring(0, pos + 1) + append;
    }

    public static string ResolveServerUrl(string serverUrl, bool forceHttps)
    {
      if (serverUrl.IndexOf("://") > -1)
        return serverUrl;

      string newUrl = serverUrl;
      Uri originalUri = System.Web.HttpContext.Current.Request.Url;
      newUrl = (forceHttps ? "https" : originalUri.Scheme) +
          "://" + originalUri.Authority + newUrl;
      return newUrl;
    }

    public static string PrepareForUrl(string str, int maxLength)
    {
      if (String.IsNullOrEmpty(str))
        return str;
      string res = System.Text.RegularExpressions.Regex.Replace(str.Trim().ToLower().Replace(" ", "-"), "[^a-zA-Z0-9_-]+", "");
      res = res.Substring(0, Math.Min(res.Length, maxLength));
      return res;
    }

  }

You can use the static methods above as follows:

string stringcontent = HtmlProcessing.GetShortString(HtmlProcessing.PrepareText(String.IsNullOrEmpty(htmlcontent) ? "" : htmlcontent), 150, 250, ".");
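To make the effect of minLength, maxLength and delimiter more concrete, here is a small sketch of the three ways GetShortString can cut a string. The class name ShortStringDemo, the Demo method and the sample strings are made up for this illustration, and the method body is a condensed copy of the GetShortString logic (with an ordinal search), repeated only so that the snippet compiles on its own.

```csharp
using System;

public static class ShortStringDemo
{
    // Condensed copy of the GetShortString logic, for illustration only.
    public static string GetShortString(string source, int minLength, int maxLength, string delimiter)
    {
        if (string.IsNullOrEmpty(source)) return source;
        if (minLength > maxLength)
            throw new ArgumentException("Wrong minimum and maximum lengths.");
        if (source.Length <= maxLength) return source;

        string res = source.Substring(0, maxLength);
        string append = "";
        int pos = string.IsNullOrEmpty(delimiter)
            ? -1
            : res.LastIndexOf(delimiter, StringComparison.Ordinal);
        if (pos < minLength)
        {
            pos = res.LastIndexOf(' ');
            // Either drop the trailing space, or hard-cut and leave room for "...".
            pos = pos < minLength ? Math.Max(maxLength - 4, -1) : pos - 1;
            append = "...";
        }
        return res.Substring(0, pos + 1) + append;
    }

    public static void Demo()
    {
        // 1. A delimiter lies between minLength and maxLength: cut after it, no ellipsis.
        Console.WriteLine(GetShortString(
            "First sentence. Second sentence that runs on for a while", 10, 30, "."));
        // prints "First sentence."

        // 2. No delimiter, but a space is available: cut at the last whole word.
        Console.WriteLine(GetShortString("one two three four five six seven", 5, 20, "."));
        // prints "one two three four..."

        // 3. Neither delimiter nor space: hard cut so the result is exactly maxLength long.
        Console.WriteLine(GetShortString("abcdefghijklmnopqrstuvwxyz", 5, 10, "."));
        // prints "abcdefg..."
    }
}
```

So with the 150, 250, "." arguments used above, the summary ends at the last full stop found between positions 150 and 250 when there is one, and otherwise at a word boundary followed by "...".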

I hope the above class library is useful.

Posted On: 10-Mar-2017 02:52