⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 cmshtmlconverter.java

📁 cms是开源的框架
💻 JAVA
字号:
/*
 * File   : $Source: /usr/local/cvs/opencms/src/org/opencms/util/CmsHtmlConverter.java,v $
 * Date   : $Date: 2006/04/28 15:20:52 $
 * Version: $Revision: 1.26 $
 *
 * This library is part of OpenCms -
 * the Open Source Content Mananagement System
 *
 * Copyright (c) 2005 Alkacon Software GmbH (http://www.alkacon.com)
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * For further information about Alkacon Software GmbH, please see the
 * company website: http://www.alkacon.com
 *
 * For further information about OpenCms, please see the
 * project website: http://www.opencms.org
 * 
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */

package org.opencms.util;

import org.opencms.file.CmsObject;
import org.opencms.file.CmsProperty;
import org.opencms.file.CmsPropertyDefinition;
import org.opencms.file.CmsResource;
import org.opencms.i18n.CmsEncoder;
import org.opencms.main.CmsException;
import org.opencms.main.CmsLog;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

import org.apache.commons.logging.Log;

import org.w3c.tidy.Tidy;

/**
 * Html cleaner and pretty printer.<p>
 * 
 * Used to clean up html code (e.g. remove word tags) and optionally create xhtml from html.<p>
 *   
 * @author Michael Emmerich 
 * @author Alexander Kandzior
 * 
 * @version $Revision: 1.26 $ 
 * 
 * @since 6.0.0 
 */
public class CmsHtmlConverter {

    /** Param value for disabled mode. **/
    public static final String PARAM_DISABLED = CmsStringUtil.FALSE;

    /** Param value for enabled mode. **/
    public static final String PARAM_ENABLED = CmsStringUtil.TRUE;

    /** Param value for WORD mode. **/
    public static final String PARAM_WORD = "cleanup";

    /** Param value for XHTML mode. **/
    public static final String PARAM_XHTML = "xhtml";

    /** The log object for this class. */
    private static final Log LOG = CmsLog.getLog(CmsHtmlConverter.class);

    /** Regular expression for cleanup. */
    String[] m_cleanupPatterns = {
        "<o:p>.*(\\r\\n)*.*</o:p>",
        "<o:p>.*(\\r\\n)*.*</O:p>",
        "<\\?xml:.*(\\r\\n).*/>",
        "<\\?xml:.*(\\r\\n).*(\\r\\n).*/\\?>",
        "<\\?xml:.*(\\r\\n).*(\\r\\n).*/>",
        "<\\?xml:(.*(\\r\\n)).*/\\?>",
        "<o:SmartTagType.*(\\r\\n)*.*/>",
        "<o:smarttagtype.*(\\r\\n)*.*/>"};

    /** Patterns for cleanup. */
    Pattern[] m_clearStyle;

    /** The input encoding. */
    String m_encoding;

    /** Regular expression for replace. */
    String[] m_replacePatterns = {
        "&#160;",
        "(\\r\\n){2,}",
        "

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -