<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>kafka0102的边城客栈</title>
	<atom:link href="http://www.kafka0102.com/feed" rel="self" type="application/rss+xml" />
	<link>http://www.kafka0102.com</link>
	<description>Live the simplest life and hold the most distant dream, even if tomorrow brings bitter cold, a long road, and a dead horse.</description>
	<lastBuildDate>Sun, 05 Sep 2010 11:50:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>[Solr in Practice] A custom SolrEventListener for a searcher autowarm strategy</title>
		<link>http://www.kafka0102.com/2010/09/326.html</link>
		<comments>http://www.kafka0102.com/2010/09/326.html#comments</comments>
		<pubDate>Sun, 05 Sep 2010 11:50:40 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[solr]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=326</guid>
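		<description><![CDATA[Solr warms up (autowarms) a searcher at two points: when the system starts (firstSearcher), and when a new searcher replaces the old one (newSearcher). Solr lets you configure a SolrEventListener for a SolrCore in solrconfig.xml to implement custom autowarming. In most cases the default implementation that Solr provides, QuerySenderListener, is enough. In my case, I want the SolrEventListener configured in solrconfig.xml to serve multiple SolrCores, mainly because my SolrCores share a single solrconfig.xml. As for which queries to configure for autowarming, the simplest choice is a common query; but if the system runs sorted queries, you can configure suitable sort conditions to warm Lucene's fieldCache. Below is my custom SolrEventListener. Its effect is: if a SolrCore has no queries configured under its own name, it uses the ones under default; otherwise it uses its own.]]></description>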
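			<content:encoded><![CDATA[<p>Solr warms up (autowarms) a searcher at two points: when the system starts (firstSearcher), and when a new searcher replaces the old one (newSearcher). Solr lets you configure a SolrEventListener for a SolrCore in solrconfig.xml to implement custom autowarming. In most cases the default implementation that Solr provides, QuerySenderListener, is enough. In my case, I want the SolrEventListener configured in solrconfig.xml to serve multiple SolrCores, mainly because my SolrCores share a single solrconfig.xml. As for which queries to configure for autowarming, the simplest choice is a common query; but if the system runs sorted queries, you can configure suitable sort conditions to warm Lucene's fieldCache. Below is my custom SolrEventListener. Its effect is: if a SolrCore has no queries configured under its own name, it uses the ones under default; otherwise it uses its own.<br />
	The implementation is adapted from Solr's QuerySenderListener. The code is as follows:</p>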

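<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">// Package names below assume Solr 1.4.x; adjust them for other versions.</span>
<span style="color: #000000; font-weight: bold;">import</span> java.util.List<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.common.params.CommonParams<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.common.params.EventParams<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.common.util.NamedList<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.core.SolrCore<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.core.SolrEventListener<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.request.LocalSolrQueryRequest<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.request.SolrQueryResponse<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.search.DocIterator<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.search.DocList<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.apache.solr.search.SolrIndexSearcher<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.slf4j.Logger<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.slf4j.LoggerFactory<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> TubaQuerySenderListener <span style="color: #000000; font-weight: bold;">implements</span> SolrEventListener <span style="color: #009900;">&#123;</span>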
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> Logger logger <span style="color: #339933;">=</span> LoggerFactory
  .<span style="color: #006633;">getLogger</span><span style="color: #009900;">&#40;</span>TubaQuerySenderListener.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">protected</span> <span style="color: #000000; font-weight: bold;">final</span> SolrCore core<span style="color: #339933;">;</span>
  <span style="color: #000000; font-weight: bold;">protected</span> List<span style="color: #339933;">&lt;</span>NamedList<span style="color: #339933;">&gt;</span> queryArgs<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> init<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> NamedList args<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">String</span> coreName <span style="color: #339933;">=</span> core.<span style="color: #006633;">getName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    queryArgs <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>List<span style="color: #339933;">&lt;</span>NamedList<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#41;</span>args.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>coreName<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>queryArgs <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      queryArgs <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>List<span style="color: #339933;">&lt;</span>NamedList<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#41;</span>args.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;default&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>queryArgs <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        logger.<span style="color: #006633;">warn</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;TubaQuerySenderListener not valid for core:&quot;</span><span style="color: #339933;">+</span>coreName<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
    logger.<span style="color: #006633;">info</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;core[&quot;</span><span style="color: #339933;">+</span>coreName<span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;]register TubaQuerySenderListener : &quot;</span> <span style="color: #339933;">+</span> queryArgs<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> TubaQuerySenderListener<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> SolrCore core<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">core</span> <span style="color: #339933;">=</span> core<span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #008000; font-style: italic; font-weight: bold;">/**
   * Add the {@link org.apache.solr.common.params.EventParams#EVENT} with either the {@link org.apache.solr.common.params.EventParams#NEW_SEARCHER}
   * or {@link org.apache.solr.common.params.EventParams#FIRST_SEARCHER} values depending on the value of currentSearcher.
   * &lt;p/&gt;
   * Makes a copy of NamedList and then adds the parameters.
   *
   *
   * @param currentSearcher If null, add FIRST_SEARCHER, otherwise NEW_SEARCHER
   * @param nlst The named list to add the EVENT value to
   */</span>
  <span style="color: #000000; font-weight: bold;">protected</span> NamedList addEventParms<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> SolrIndexSearcher currentSearcher, <span style="color: #000000; font-weight: bold;">final</span> NamedList nlst<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> NamedList result <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> NamedList<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    result.<span style="color: #006633;">addAll</span><span style="color: #009900;">&#40;</span>nlst<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>currentSearcher <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      result.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span>EventParams.<span style="color: #006633;">EVENT</span>, EventParams.<span style="color: #006633;">NEW_SEARCHER</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
      result.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span>EventParams.<span style="color: #006633;">EVENT</span>, EventParams.<span style="color: #006633;">FIRST_SEARCHER</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">return</span> result<span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> newSearcher<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> SolrIndexSearcher newSearcher, <span style="color: #000000; font-weight: bold;">final</span> SolrIndexSearcher currentSearcher<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>queryArgs <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">final</span> SolrIndexSearcher searcher <span style="color: #339933;">=</span> newSearcher<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> NamedList nlst <span style="color: #339933;">:</span> queryArgs<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// bind the request to a particular searcher (the newSearcher)</span>
        <span style="color: #000000; font-weight: bold;">final</span> NamedList params <span style="color: #339933;">=</span> addEventParms<span style="color: #009900;">&#40;</span>currentSearcher, nlst<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">final</span> LocalSolrQueryRequest req <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> LocalSolrQueryRequest<span style="color: #009900;">&#40;</span>core,params<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          @Override <span style="color: #000000; font-weight: bold;">public</span> SolrIndexSearcher getSearcher<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span> <span style="color: #000000; font-weight: bold;">return</span> searcher<span style="color: #339933;">;</span> <span style="color: #009900;">&#125;</span>
          @Override <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> close<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span> <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #000000; font-weight: bold;">final</span> SolrQueryResponse rsp <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> SolrQueryResponse<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        core.<span style="color: #006633;">execute</span><span style="color: #009900;">&#40;</span>core.<span style="color: #006633;">getRequestHandler</span><span style="color: #009900;">&#40;</span>req.<span style="color: #006633;">getParams</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>CommonParams.<span style="color: #006633;">QT</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>, req, rsp<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #666666; font-style: italic;">// Retrieve the Document instances (not just the ids) to warm</span>
        <span style="color: #666666; font-style: italic;">// the OS disk cache, and any Solr document cache.  Only the top</span>
        <span style="color: #666666; font-style: italic;">// level values in the NamedList are checked for DocLists.</span>
        <span style="color: #000000; font-weight: bold;">final</span> NamedList values <span style="color: #339933;">=</span> rsp.<span style="color: #006633;">getValues</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;</span>values.<span style="color: #006633;">size</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> o <span style="color: #339933;">=</span> values.<span style="color: #006633;">getVal</span><span style="color: #009900;">&#40;</span>i<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
          <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>o <span style="color: #000000; font-weight: bold;">instanceof</span> DocList<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">final</span> DocList docs <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>DocList<span style="color: #009900;">&#41;</span>o<span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> DocIterator iter <span style="color: #339933;">=</span> docs.<span style="color: #006633;">iterator</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> iter.<span style="color: #006633;">hasNext</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
              newSearcher.<span style="color: #006633;">doc</span><span style="color: #009900;">&#40;</span>iter.<span style="color: #006633;">nextDoc</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
          <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
        req.<span style="color: #006633;">close</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
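        <span style="color: #666666; font-style: italic;">// Log the failure and continue, so that one failing warm-up query</span>
        <span style="color: #666666; font-style: italic;">// does not prevent the remaining requests from running.</span>
        logger.<span style="color: #006633;">warn</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;core[&quot;</span><span style="color: #339933;">+</span>core.<span style="color: #006633;">getName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;] warm-up query failed&quot;</span>, e<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>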
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
    logger.<span style="color: #006633;">info</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;core[&quot;</span><span style="color: #339933;">+</span>core.<span style="color: #006633;">getName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;]TubaQuerySenderListener newSearcher done.&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> postCommit<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">throw</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">UnsupportedOperationException</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>A sample configuration in solrconfig.xml follows. The firstSearcher and newSearcher configurations are identical here, but that does not mean they have to be.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;query<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;listener</span> <span style="color: #000066;">event</span>=<span style="color: #ff0000;">&quot;firstSearcher&quot;</span></span>
<span style="color: #009900;">			<span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;com.tintintech.tuba.search.TubaQuerySenderListener&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;arr</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;default&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;q&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>手机<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;start&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>0<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;rows&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>10<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/arr<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;arr</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;core1&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;q&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>手机<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;start&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>0<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;rows&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>10<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;sort&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>at desc<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/arr<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/listener<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;listener</span> <span style="color: #000066;">event</span>=<span style="color: #ff0000;">&quot;newSearcher&quot;</span></span>
<span style="color: #009900;">			<span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;com.tintintech.tuba.search.TubaQuerySenderListener&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;arr</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;default&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;q&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>手机<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;start&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>0<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;rows&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>10<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/arr<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;arr</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;core1&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;q&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>手机<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;start&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>0<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;rows&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>10<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;sort&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>at desc<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
				<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/arr<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/listener<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/query<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/09/326.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Trouble with Solr</title>
		<link>http://www.kafka0102.com/2010/08/319.html</link>
		<comments>http://www.kafka0102.com/2010/08/319.html#comments</comments>
		<pubDate>Sat, 21 Aug 2010 20:36:37 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[solr]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=319</guid>
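		<description><![CDATA[I am rewriting my company's site search. After an initial period of getting familiar with Lucene and Solr, I decided to use Solr as the base framework for the new system. We are now in the late part of the first development phase, and the core code is over 11,000 lines (not counting admin, client, and so on). The new system already has somewhat richer functionality than the existing one, but comparing the total code size of the two, the new system is actually not much larger. Using Solr replaced some functionality that the existing system implements itself, which reduced the code size; at the same time, the new system implements features the existing one lacks, which added some. During this period, because much of the new system's code is independent of Solr, my interaction with Solr has been on and off, so even late in development I still run into details of Solr's implementation that trouble me.]]></description>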
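			<content:encoded><![CDATA[<p>I am rewriting my company's site search. After an initial period of getting familiar with Lucene and Solr, I decided to use Solr as the base framework for the new system. We are now in the late part of the first development phase, and the core code is over 11,000 lines (not counting admin, client, and so on). The new system already has somewhat richer functionality than the existing one, but comparing the total code size of the two, the new system is actually not much larger. Using Solr replaced some functionality that the existing system implements itself, which reduced the code size; at the same time, the new system implements features the existing one lacks, which added some. During this period, because much of the new system's code is independent of Solr, my interaction with Solr has been on and off, so even late in development I still run into details of Solr's implementation that trouble me.</p>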
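<p>Setting my own system aside, if you need to choose a site search solution, Solr may be a very good choice in some scenarios. Because Solr provides a web server that supports updating the index, rebuilding the index, querying, and so on over HTTP, if your requirements line up with Solr you may not need any secondary development on top of it at all. What a wonderful thing. However, if you need more advanced features, you will probably have to do some work on top of Solr. For example, if the index is large, you can split it into multiple shards and query across them; Solr supports this. Building the index, though, is something you have to handle yourself, for instance by putting a proxy (or just a library function) in front of Solr that submits documents to different shards according to a specific strategy. That is still manageable. But if I need a query that spans multiple indexes, each with a different schema, for example for a whole-site search, then Solr's shard query is of no use, because it requires all shards to have the same schema. What I have to build is really a general-purpose search, so problems like this come one after another. Working things out with Solr took a good deal of time: understanding its features, design, and source code; sometimes compromising and developing its way; sometimes abandoning features it already implements and starting from scratch. Still, it cannot be denied that for someone new to site search development like me, using Solr was not a bad choice. I learned from what Solr does well and also saw where it falls short, and both are gains. This post briefly summarizes the parts of using Solr that I found less pleasant; the pleasant parts I will leave for another time.</p>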
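<p>At the core of Solr's implementation is SolrCore. Each SolrCore corresponds to one index, and almost every operation targets a single SolrCore. That seems to have been Solr's intent from the start, with no thought given to relationships between SolrCores. As a result, every SolrQueryRequest is tied to one SolrCore, and a SolrRequestHandler is also obtained from a SolrCore. This unfortunate design means that, to manage multiple SolrCores, Solr had to come up with CoreAdminHandler, which implements the SolrRequestHandler interface but lives outside any SolrCore, so it is also configured differently from other handlers. Solr's shard query support is worse still: it requires the SolrCores behind the shards to share one schema and cannot query heterogeneous SolrCores. To solve this I added a VirtualCore on top of Solr (the name does not look great in hindsight; IndexCore might be better). A VirtualCore contains one or more SolrCores, and many operations target the VirtualCore rather than a SolrCore. For example, suppose the index named index is split into index.0, index.1, and index.2. For both indexing and querying, the client only sends requests with core=index and does not need to care how many pieces index has been split into or where they live; the system handles that through configuration. A site-wide query across several indexes is then a query over several VirtualCores, and the request parameters for each VirtualCore can be set in configuration without the strict constraints Solr imposes.</p>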
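<p>To make the idea concrete, here is a minimal sketch of how a logical core name could be mapped to physical shards. It is not the actual VirtualCore code: the class name, the method names, and the hash-modulo placement policy are all assumptions made for illustration.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: maps a logical core name such as "index" to its physical shards. */
final class VirtualCoreSketch {
    private final String name;
    private final int shardCount;

    VirtualCoreSketch(String name, int shardCount) {
        if (shardCount < 1) {
            throw new IllegalArgumentException("shardCount must be >= 1");
        }
        this.name = name;
        this.shardCount = shardCount;
    }

    /** Indexing: pick one shard for a document (assumed policy: hash of the id modulo shard count). */
    String shardForDocument(String docId) {
        int slot = Math.floorMod(docId.hashCode(), shardCount);
        return name + "." + slot;
    }

    /** Querying: fan out to every shard behind this logical core. */
    List<String> shardsForQuery() {
        List<String> shards = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            shards.add(name + "." + i);
        }
        return shards;
    }
}
```

<p>With new VirtualCoreSketch("index", 3), a query fans out to index.0, index.1, and index.2, while each document lands on exactly one of them, so the client only ever needs to name core=index.</p>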
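<p>Introducing VirtualCore meant that some of Solr's implementation could no longer be used as is. The first casualty was SearchHandler, which I had to rewrite on top of the original. Its handling of failed shard requests is also problematic: if any one shard request fails, it returns no results at all. The benefit is that the returned results are globally accurate, but availability suffers. Caching of query results has to be considered here too: if a query result cache sits in front of Solr, Solr's insistence on accuracy is necessary. In my implementation, I process results from however many shards respond, but when a failure occurs I do not cache the query result.</p>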
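<p>The policy described above (use whatever shards answer, but never cache an incomplete answer) can be sketched roughly as follows. This is not Solr's SearchHandler or my rewrite of it; the class, the Result type, and the use of Callable to stand in for a shard request are hypothetical.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical sketch: merge shard responses and tolerate individual shard failures. */
final class PartialShardMerge {
    /** Merged ids plus a flag telling the caller whether the result is safe to cache. */
    static final class Result {
        final List<String> ids = new ArrayList<>();
        boolean complete = true;
    }

    /** Each Callable stands in for one shard request; shards are queried sequentially here for brevity. */
    static Result queryAll(List<Callable<List<String>>> shardQueries) {
        Result result = new Result();
        for (Callable<List<String>> query : shardQueries) {
            try {
                result.ids.addAll(query.call());
            } catch (Exception e) {
                // One shard failed: keep what the others returned, but mark the result incomplete.
                result.complete = false;
            }
        }
        return result;
    }
}
```

<p>A caller would put the result into its query cache only when complete is true, which keeps availability high without letting a partial answer be served from cache later.</p>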
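<p>VirtualCore also rules out Solr's powerful DIH (DataImportHandler), but even without VirtualCore, DIH would struggle with submitting documents to multiple shards from a single point. DIH rebuilds the index directly on the target SolrCore and does not offer flexible hooks into the rebuild process (although it does provide some). In my case, I want each indexed document to also update a summary database according to some policy. I went through DIH's documentation and code, and that looks hard to do. Moreover, DIH rebuilds directly on the live index, so if the rebuild takes a long time or goes wrong, incoming index updates are blocked and normal service is affected.</p>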
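<p>Solr's handling of configuration files is not great either. solrconfig.xml supports substituting Java property values for variables in the file, but solr.xml does not, so the production and development config files are littered with different absolute paths. There are good parts, such as schema.xml supporting XInclude, which lets the schema.xml files of several indexes share field type definitions. It would be easier to manage, though, if the schemas of several indexes could live in one file rather than being scattered across many. The same problem exists with solrconfig.xml: most of its settings are generic, but with several indexes the searcher warm-up request parameters may differ for each, which has me planning to set aside time to rewrite its default Listener implementation.</p>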
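<p>One detail of Solr's index replication is that master and slave keep a long-lived connection, and the master keeps sending data to the slave by repeatedly calling flush on the OutputStream. With a Servlet container, where the OutputStream comes from the Servlet, this works fine. With Netty as the server framework and Netty's HTTP implementation, however, this effect cannot be achieved, so I had to drop Netty and switch to Jetty.</p>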
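<p>Back to querying. Solr's SearchHandler only produces a list of doc ids, not the contents of the requested fields; the fields are fetched from the IndexReader by doc id when the ResponseWriter writes the output. In my design the index stores only the logical primary key id, and after getting those ids the other fields are fetched from a separate summary store (or the id list is simply returned to the client). I obviously need to finish this before the ResponseWriter writes its output, so I had to set the request's list of fields to return to empty. The ResponseWriter also has to be bound to a SolrCore's schema, so for a VirtualCore, which Solr knows nothing about, I had to get by with a fake schema that has an empty configuration and no index.</p>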
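<p>Configuration again: the master url configured on the slave side for Solr replication has to include the core parameter, so every SolrCore has a different master url and they cannot share one solrconfig.xml, which I would really like them to do. The core parameter could easily be obtained inside ReplicationHandler. One possible reason Solr does not do so is that the request url format it supports is http://host/corename/qt?xx=dd. Having corename as part of the url path annoyed me, so I changed the request format to http://host/qt?core=aaa&amp;xx=dd and, as a last resort, copied Solr's replication code over and added a few lines to finish the job.</p>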
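<p>There are more problems, of course, but as mentioned above, every problem needs some solution, even if some of the solutions look a bit silly. Thinking back over the problems above, I have some doubts about the usability of what I have built so far. I have not yet tested the whole system end to end, so it still needs careful polishing. Fortunately, the better I understand Solr, the better I can handle it.</p>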
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/319.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>HttpClient's “Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.” warning explained</title>
		<link>http://www.kafka0102.com/2010/08/316.html</link>
		<comments>http://www.kafka0102.com/2010/08/316.html#comments</comments>
		<pubDate>Sat, 21 Aug 2010 07:43:19 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[httpclient]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=316</guid>
		<description><![CDATA[When using HttpClient, I kept getting the WARN log “Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.” I traced it to the following HttpClient source:

    public byte&#91;&#93; getResponseBody&#40;&#41; throws IOException &#123;
        if &#40;this.responseBody == null&#41; &#123;
            InputStream instream = getResponseBodyAsStream&#40;&#41;;
     [...]]]></description>
			<content:encoded><![CDATA[<p>When using HttpClient, I kept getting the WARN log “Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.” I traced it to the following HttpClient source:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> getResponseBody<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">IOException</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">responseBody</span> <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399;">InputStream</span> instream <span style="color: #339933;">=</span> getResponseBodyAsStream<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>instream <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                <span style="color: #000066; font-weight: bold;">long</span> contentLength <span style="color: #339933;">=</span> getResponseContentLength<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>contentLength <span style="color: #339933;">&gt;</span> <span style="color: #003399;">Integer</span>.<span style="color: #006633;">MAX_VALUE</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span> <span style="color: #666666; font-style: italic;">//guard below cast from overflow</span>
                    <span style="color: #000000; font-weight: bold;">throw</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">IOException</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Content too large to be buffered: &quot;</span><span style="color: #339933;">+</span> contentLength <span style="color: #339933;">+</span><span style="color: #0000ff;">&quot; bytes&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
                <span style="color: #000066; font-weight: bold;">int</span> limit <span style="color: #339933;">=</span> getParams<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">getIntParameter</span><span style="color: #009900;">&#40;</span>HttpMethodParams.<span style="color: #006633;">BUFFER_WARN_TRIGGER_LIMIT</span>, <span style="color: #cc66cc;">1024</span><span style="color: #339933;">*</span><span style="color: #cc66cc;">1024</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>contentLength <span style="color: #339933;">==</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">||</span> <span style="color: #009900;">&#40;</span>contentLength <span style="color: #339933;">&gt;</span> limit<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                    LOG.<span style="color: #006633;">warn</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Going to buffer response body of large or unknown size. &quot;</span>
                            <span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;Using getResponseBodyAsStream instead is recommended.&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
                LOG.<span style="color: #006633;">debug</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Buffering response body&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #003399;">ByteArrayOutputStream</span> outstream <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">ByteArrayOutputStream</span><span style="color: #009900;">&#40;</span>
                        contentLength <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">0</span> <span style="color: #339933;">?</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span> contentLength <span style="color: #339933;">:</span> DEFAULT_INITIAL_BUFFER_SIZE<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> buffer <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">4096</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                <span style="color: #000066; font-weight: bold;">int</span> len<span style="color: #339933;">;</span>
                <span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>len <span style="color: #339933;">=</span> instream.<span style="color: #006633;">read</span><span style="color: #009900;">&#40;</span>buffer<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                    outstream.<span style="color: #006633;">write</span><span style="color: #009900;">&#40;</span>buffer, <span style="color: #cc66cc;">0</span>, len<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
                outstream.<span style="color: #006633;">close</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                setResponseStream<span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">responseBody</span> <span style="color: #339933;">=</span> outstream.<span style="color: #006633;">toByteArray</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">responseBody</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span></pre></div></div>

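<p>    The WARN is logged when ((contentLength == -1) || (contentLength > limit)), that is, either the HTTP response headers do not specify a Content-Length, or the content length exceeds the limit (1 MB by default). If you are sure the response size has no significant effect on your program, the WARN can be ignored: raise HttpClient's log level to ERROR in your logging configuration so that it is no longer reported.</p>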
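<p>    The warning does have a point, of course: HttpClient recommends InputStream getResponseBodyAsStream() over byte[] getResponseBody(). When the response is very large or its size cannot be known in advance, use InputStream getResponseBodyAsStream() to avoid the memory exhaustion that byte[] getResponseBody() may cause.</p>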
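<p>    As a rough illustration of the streaming approach, the sketch below copies an InputStream to an OutputStream in fixed-size chunks, so memory use stays bounded no matter how large the body is. The class and method names are hypothetical. With HttpClient 3.x, the input would be the stream returned by getResponseBodyAsStream(), and the connection should be released (releaseConnection()) once the stream has been consumed.</p>

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: consume a response body as a stream instead of buffering it whole. */
final class StreamCopySketch {
    /** Copies the stream in 4 KB chunks and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int len;
        // read() returns -1 at end of stream
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
            total += len;
        }
        return total;
    }
}
```

<p>    The output can be a file, a parser, or anything else that processes the data incrementally; only the 4 KB buffer is held in memory at any time.</p>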
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/316.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Building Robust Java Benchmarks</title>
		<link>http://www.kafka0102.com/2010/08/312.html</link>
		<comments>http://www.kafka0102.com/2010/08/312.html#comments</comments>
		<pubDate>Sat, 14 Aug 2010 17:34:49 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[java benchmark]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=312</guid>
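		<description><![CDATA[This week I came across several good articles on benchmarking; had I not tinkered with performance tests of various locks last weekend, I might have missed them. Two of them are from IBM developerWorks: Robust Java benchmarking, Part 1: Issues and Robust Java benchmarking, Part 2: Statistics and solutions. The author also has a dedicated page, Java benchmarking article, that provides a Java benchmarking framework for anyone interested. This post is a brief summary of Robust Java benchmarking, Part 1: Issues. While reading the two articles I also read some of their references, several of which are well worth reading.]]></description>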
			<content:encoded><![CDATA[<p>This week I came across several good articles on benchmarking; had I not tinkered with performance tests of various locks last weekend, I might have missed them. Two of them are from IBM developerWorks: <a href="http://www.ibm.com/developerworks/library/j-benchmark1.html" target="_blank"> Robust Java benchmarking, Part 1: Issues</a> and <a href="http://www.ibm.com/developerworks/java/library/j-benchmark2/index.html" target="_blank"> Robust Java benchmarking, Part 2: Statistics and solutions</a>. The author also has a dedicated page, <a href="http://www.ellipticgroup.com/html/benchmarkingArticle.html" target="_blank"> Java benchmarking article</a>, that provides a Java benchmarking framework for anyone interested. This post is a brief summary of Robust Java benchmarking, Part 1: Issues. While reading the two articles I also read some of their references, several of which are well worth reading.</p>
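<h2>1. Measuring execution time</h2>
<p>A benchmark usually goes like this: 1) record the start time, 2) run the code, 3) record the end time, 4) compute the difference. In Java, programmers often record time with System.currentTimeMillis, but its resolution is limited and varies by operating system: it may be off by as much as 55 ms on Win98 and about 1 ms on Linux 2.6. If the task under test runs for only a few milliseconds, this error affects the correctness of the result. System.nanoTime, introduced in JDK 1.5, is the better choice and does better in both precision and accuracy. This error, along with errors from other sources, is a reminder to make the task run for a reasonably long time when benchmarking.</p>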
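<h2>2. Code warm-up</h2>
<p>Java execution is complex, and many factors in it affect a benchmark. Typically Java code runs relatively slowly at first and then gets faster until it reaches a steady state. The main factors involved are:<br />
1) Class loading. Class loading involves reading files, parsing, verification, and so on, so before timing the real task, run the task a few times to make sure class loading has finished. If the task has many conditional branches, make sure the code in every branch is covered.<br />
2) Just-in-time compilation. The JVM executes the bytecode that Java code is compiled to, and interpreting bytecode is in most cases slower than running machine code. So while Java code runs, the JVM compiles “hot” bytecode into machine code to speed it up; this is JIT. Sun's (Oracle's?) HotSpot JVM has two modes, client and server. At startup, client performs fewer optimizations than server and so starts faster. At run time, the JVM compiles a piece of “hot” code into machine code after 1500 invocations in client mode and 10000 invocations in server mode. So for a benchmark to run in the steady state, the JVM needs to have dynamically compiled the task's code into machine code. You can inspect JIT activity with CompilationMXBean.getTotalCompilationTime and the -XX:+PrintCompilation startup flag.<br />
To deal with the two issues above, a workable benchmarking procedure is:<br />
1. Run the task once to load all the classes.<br />
2. Run the task enough times for the JVM to reach the steady state.<br />
3. Run the task a few more times to get an estimate of its execution time.<br />
4. Use step 3 to compute n, the number of times to run the task next, so that the cumulative execution time is large enough.<br />
5. Measure the total execution time t of n runs of the task.<br />
6. Estimate the time of a single run as t/n.</p>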
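<p>The six steps can be sketched in a few lines of Java. This is only an illustration of the procedure, not the framework from the article: the class name, the probe size of 10 runs, and the warmupRuns and targetNanos parameters are assumptions.</p>

```java
/** Hypothetical sketch of the warm-up-then-measure procedure. */
final class WarmBenchmarkSketch {
    /** Runs the task the given number of times and returns the elapsed time in nanoseconds. */
    static long timeRuns(Runnable task, long runs) {
        long start = System.nanoTime();
        for (long i = 0; i < runs; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    /** Returns the estimated time of a single run, in nanoseconds. */
    static double measure(Runnable task, int warmupRuns, long targetNanos) {
        task.run();                                             // step 1: load the classes
        timeRuns(task, warmupRuns);                             // step 2: reach the steady state
        long probeRuns = 10;
        long probe = Math.max(1, timeRuns(task, probeRuns));    // step 3: rough estimate
        long n = Math.max(1, targetNanos * probeRuns / probe);  // step 4: choose n
        long t = timeRuns(task, n);                             // step 5: total time of n runs
        return (double) t / n;                                  // step 6: time per run
    }
}
```

<p>warmupRuns should comfortably exceed the compile thresholds mentioned above (for example 20000), and targetNanos should be large compared with the timer's resolution, on the order of a second for a real measurement.</p>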
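<h2>3. Dynamic optimization</h2>
<p>Dynamic compilation is not done once and for all. It brings a series of issues, the most prominent being:<br />
1) Deoptimization: While a Java program runs, a given piece of code is not necessarily compiled only once. Depending on how execution goes, the JVM may throw away or recompile code it has already compiled in order to get a better result. The JVM usually compiles one “hot” piece of code, but at run time the code around it (for example, branches that are rarely taken) may become “hot” too, and then the whole section may need to be recompiled.<br />
2) On-stack replacement: Early HotSpot, when it compiled a “hot” method, used the compiled machine code only from the next invocation of that method; the current invocation kept being interpreted. So if the method is never called again, the compilation is worthless. An extreme example is a long loop in main: HotSpot compiles the machine code in time, but the running loop cannot use it, and when the loop ends main ends too, so the compilation was wasted. HotSpot therefore later introduced OSR, which can switch from bytecode to machine code in the middle of a method's execution. OSR sounds good, but OSR-compiled code is often not fully optimized, which is not ideal for benchmarking. To avoid the OSR problem, generally do not put all the code in one method.<br />
3) Dead-code elimination: Dead code takes many forms. For example, when a method returns a value that the caller never receives or uses, the JVM can optimize the call away. Many languages have this optimization, but for a benchmark it is not good news, so watch out for it when writing benchmark code.</p>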
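<p>One common way to guard against dead-code elimination is to make every result flow into a value that the benchmark actually uses. The sketch below shows the idea; the class and method names are hypothetical, and consuming the result makes elimination less likely rather than impossible.</p>

```java
/** Hypothetical sketch: keep benchmark results "live" so the JIT cannot discard the work. */
final class DeadCodeSketch {
    /** The work being measured; it returns a value so that the caller can consume it. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    /** Accumulates every result into a checksum that the caller must use, for example by printing it. */
    static long runAndConsume(int runs, int n) {
        long checksum = 0;
        for (int r = 0; r < runs; r++) {
            checksum += sumOfSquares(n);
        }
        return checksum;
    }
}
```

<p>Printing the checksum after the timed section, or comparing it with an expected value, gives the result an observable effect. Keeping the measured work in its own method also follows the advice above about OSR.</p>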
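<h2>4. Resource reclamation</h2>
<p>Garbage collection and object finalization (GC/OF) are the JVM's own behavior. Left alone, they tend to affect the benchmark at moments you do not know about. There are usually two ways to reduce the impact of GC/OF: 1) make the task run longer so that the effect of GC averages out, or 2) run the task enough times and call System.gc and System.runFinalization after each run, taking the initiative instead of waiting for the JVM.</p>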
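<p>The second approach might look like the sketch below, which requests a collection repeatedly until used memory stops shrinking. The class name and the stopping rule are assumptions. The sketch calls only System.gc and leaves out System.runFinalization, because finalization is deprecated in recent JDK releases.</p>

```java
/** Hypothetical sketch: ask the JVM to clean up between benchmark runs. */
final class GcSettleSketch {
    /** Heap memory currently in use, in bytes. */
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Requests GC until used memory stops shrinking or maxRounds is reached; returns used bytes. */
    static long settle(int maxRounds) {
        long before = usedMemory();
        for (int i = 0; i < maxRounds; i++) {
            System.gc();   // only a request; the JVM is free to ignore it
            long after = usedMemory();
            if (after >= before) {
                return after;
            }
            before = after;
        }
        return before;
    }
}
```

<p>Calling settle between timed runs, outside the timed section, makes it less likely that garbage left by one run is collected in the middle of the next.</p>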
<h2>5、Caching</h2>
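<p>The OS cache and the CPU cache sometimes affect a benchmark as well. When testing file I/O, the OS cache cannot be ignored. When testing numeric computation, the CPU cache may need to be taken into account.</p>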
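<h2>6. Other factors</h2>
<p>Beyond the points above, some external factors also need to be considered, such as other programs running during the test, the hardware, JVM parameters, and so on.</p>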
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/312.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Benchmark of the Various Lock Types in Java</title>
		<link>http://www.kafka0102.com/2010/08/298.html</link>
		<comments>http://www.kafka0102.com/2010/08/298.html#comments</comments>
		<pubDate>Tue, 10 Aug 2010 15:58:03 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[HashMap]]></category>
		<category><![CDATA[lock]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=298</guid>
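		<description><![CDATA[Over the weekend I benchmarked the various kinds of locks in Java. Two conditions were varied: 1) the number of concurrent threads: 10, 50, and 100; 2) the read/write ratio: roughly 1:1, 10:1, 100:1, and 1000:1. The method is to measure Map operations under each locking scheme. Every scheme is wrapped in an implementation of the MapWrapper interface, and what gets measured is the Map's get and put methods. The lock types tested are: 1) hashtable: Hashtable used directly; 2) synclock: HashMap with synchronized on the methods (in theory this should perform about the same as Hashtable); 3) mutexlock: HashMap with an explicit Lock around the methods; 4) rwlock: HashMap with a read-write lock; 5) concrrent: ConcurrentHashMap's methods used directly; 6) HashMap with no lock on reads and a Lock on writes.]]></description>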
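			<content:encoded><![CDATA[<p>Over the weekend I benchmarked the various kinds of locks in Java. Two conditions were varied: 1) the number of concurrent threads: 10, 50, and 100; 2) the read/write ratio: roughly 1:1, 10:1, 100:1, and 1000:1. The method is to measure Map operations under each locking scheme. Every scheme is wrapped in an implementation of the MapWrapper interface, and what gets measured is the Map's get and put methods. The lock types tested are: 1) hashtable: Hashtable used directly; 2) synclock: HashMap with synchronized on the methods (in theory this should perform about the same as Hashtable); 3) mutexlock: HashMap with an explicit Lock around the methods; 4) rwlock: HashMap with a read-write lock; 5) concrrent: ConcurrentHashMap's methods used directly; 6) HashMap with no lock on reads and a Lock on writes. For each test condition, every thread performs a fixed number (100,000 or 1,000,000) of read and write operations, writing randomly generated integers. Under each condition each thread runs three times, and the average of the total processing time is taken as the result.<br />
The benchmark code is as follows:</p>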

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">interface</span> MapWrapper <span style="color: #009900;">&#123;</span>
&nbsp;
  <span style="color: #000066; font-weight: bold;">void</span> put<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> key,<span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #003399;">String</span> getName<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> HashTableMapWrapper <span style="color: #000000; font-weight: bold;">implements</span> MapWrapper<span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Map<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span> map<span style="color: #339933;">;</span>
&nbsp;
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> HashTableMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Hashtable<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map.<span style="color: #006633;">clear</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key, value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">String</span> getName<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;hashtable&quot;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> SyncMapWrapper  <span style="color: #000000; font-weight: bold;">implements</span> MapWrapper<span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Map<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span> map<span style="color: #339933;">;</span>
&nbsp;
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> SyncMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">synchronized</span> <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map.<span style="color: #006633;">clear</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">synchronized</span> <span style="color: #003399;">Object</span>  get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">synchronized</span> <span style="color: #000066; font-weight: bold;">void</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key, value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">String</span> getName<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;synclock&quot;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> LockMapWrapper <span style="color: #000000; font-weight: bold;">implements</span> MapWrapper<span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Map<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span> map<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Lock lock<span style="color: #339933;">;</span>
&nbsp;
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> LockMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    lock <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ReentrantLock<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    lock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      map.<span style="color: #006633;">clear</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      lock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    lock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">return</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #666666; font-style: italic;">// TODO: handle exception</span>
    <span style="color: #009900;">&#125;</span><span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      lock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    lock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key, value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      lock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">String</span> getName<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;mutexlock&quot;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> RWLockMapWrapper <span style="color: #000000; font-weight: bold;">implements</span> MapWrapper<span style="color: #009900;">&#123;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Map<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span> map<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Lock readLock<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Lock writeLock<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> RWLockMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> ReentrantReadWriteLock lock <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ReentrantReadWriteLock<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    readLock <span style="color: #339933;">=</span> lock.<span style="color: #006633;">readLock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    writeLock <span style="color: #339933;">=</span> lock.<span style="color: #006633;">writeLock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    writeLock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      map.<span style="color: #006633;">clear</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      writeLock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    readLock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">return</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      readLock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    writeLock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key, value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      writeLock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">String</span> getName<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;rwlock&quot;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> ConcurrentMapWrapper  <span style="color: #000000; font-weight: bold;">implements</span> MapWrapper<span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Map<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span> map<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> ConcurrentMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ConcurrentHashMap<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map.<span style="color: #006633;">clear</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key, value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">String</span> getName<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;concurrent&quot;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> WriteLockMapWrapper <span style="color: #000000; font-weight: bold;">implements</span> MapWrapper<span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Map<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span> map<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> Lock lock<span style="color: #339933;">;</span>
&nbsp;
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> WriteLockMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> HashMap<span style="color: #339933;">&lt;</span>Object,Object<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    lock <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ReentrantLock<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    lock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      map.<span style="color: #006633;">clear</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      lock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
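    <span style="color: #666666; font-style: italic;">// Reads take no lock here (only writes do); this is unsafe for a plain HashMap under concurrent writes</span>
    <span style="color: #000000; font-weight: bold;">return</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>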
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    lock.<span style="color: #006633;">lock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
      map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key, value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">finally</span> <span style="color: #009900;">&#123;</span>
      lock.<span style="color: #006633;">unlock</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  @Override
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">String</span> getName<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;writelock&quot;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> BenchMark <span style="color: #009900;">&#123;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">volatile</span> <span style="color: #000066; font-weight: bold;">long</span> totalTime<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> CountDownLatch latch<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> loop<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> threads<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">float</span> ratio<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> BenchMark<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> loop, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> threads,<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">float</span> ratio<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">super</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">loop</span> <span style="color: #339933;">=</span> loop<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">threads</span> <span style="color: #339933;">=</span> threads<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">ratio</span> <span style="color: #339933;">=</span> ratio<span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">class</span> BenchMarkRunnable <span style="color: #000000; font-weight: bold;">implements</span> <span style="color: #003399;">Runnable</span><span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> MapWrapper mapWrapper<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> size<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> benchmarkRandomReadPut<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> MapWrapper mapWrapper,<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> loop<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Random</span> random <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Random</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000066; font-weight: bold;">int</span> writeTime <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>i<span style="color: #339933;">&lt;</span>loop<span style="color: #339933;">;</span>i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> n <span style="color: #339933;">=</span> random.<span style="color: #006633;">nextInt</span><span style="color: #009900;">&#40;</span>size<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>mapWrapper.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>n<span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          mapWrapper.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>n, n<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
          writeTime<span style="color: #339933;">++;</span>
        <span style="color: #009900;">&#125;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> BenchMarkRunnable<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> MapWrapper mapWrapper,<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> size<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">mapWrapper</span> <span style="color: #339933;">=</span> mapWrapper<span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">size</span> <span style="color: #339933;">=</span> size<span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> run<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span> start <span style="color: #339933;">=</span> <span style="color: #003399;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      benchmarkRandomReadPut<span style="color: #009900;">&#40;</span>mapWrapper,loop<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span> end <span style="color: #339933;">=</span> <span style="color: #003399;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
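      <span style="color: #666666; font-style: italic;">// volatile alone does not make += atomic, so guard the read-modify-write</span>
      <span style="color: #000000; font-weight: bold;">synchronized</span> <span style="color: #009900;">&#40;</span>BenchMark.<span style="color: #000000; font-weight: bold;">this</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        totalTime <span style="color: #339933;">+=</span> end<span style="color: #339933;">-</span>start<span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>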
      latch.<span style="color: #006633;">countDown</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> benchmark<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> MapWrapper mapWrapper<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">float</span> size <span style="color: #339933;">=</span> loop<span style="color: #339933;">*</span>threads<span style="color: #339933;">*</span>ratio<span style="color: #339933;">;</span>
    totalTime <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> k<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>k<span style="color: #339933;">&lt;</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">;</span>k<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      latch <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> CountDownLatch<span style="color: #009900;">&#40;</span>threads<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>i<span style="color: #339933;">&lt;</span>threads<span style="color: #339933;">;</span>i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Thread</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> BenchMarkRunnable<span style="color: #009900;">&#40;</span>mapWrapper,<span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span>size<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">start</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
      <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
        latch.<span style="color: #006633;">await</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
      mapWrapper.<span style="color: #006633;">clear</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #003399;">Runtime</span>.<span style="color: #006633;">getRuntime</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">gc</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #003399;">Runtime</span>.<span style="color: #006633;">getRuntime</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">runFinalization</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #003399;">Thread</span>.<span style="color: #006633;">sleep</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// ignored: the pause only gives the GC a moment to run</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> rwratio <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1.0</span><span style="color: #339933;">/</span>ratio<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;[&quot;</span><span style="color: #339933;">+</span>mapWrapper.<span style="color: #006633;">getName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;]threadnum[&quot;</span><span style="color: #339933;">+</span>threads<span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;]ratio[&quot;</span><span style="color: #339933;">+</span>rwratio<span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;]avgtime[&quot;</span><span style="color: #339933;">+</span>totalTime<span style="color: #339933;">/</span><span style="color: #cc66cc;">3</span><span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;]&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> benchmark2<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> loop, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> threads,<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">float</span> ratio<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> BenchMark benchMark <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> BenchMark<span style="color: #009900;">&#40;</span>loop,threads,ratio<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> MapWrapper<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> wrappers <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> MapWrapper<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">new</span> HashTableMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
        <span style="color: #000000; font-weight: bold;">new</span> SyncMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
        <span style="color: #000000; font-weight: bold;">new</span> LockMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
        <span style="color: #000000; font-weight: bold;">new</span> RWLockMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
        <span style="color: #000000; font-weight: bold;">new</span> ConcurrentMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
        <span style="color: #000000; font-weight: bold;">new</span> WriteLockMapWrapper<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> MapWrapper wrapper <span style="color: #339933;">:</span> wrappers<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      benchMark.<span style="color: #006633;">benchmark</span><span style="color: #009900;">&#40;</span>wrapper<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> test<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000000</span>,<span style="color: #cc66cc;">10</span>,<span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 1:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000000</span>,<span style="color: #cc66cc;">10</span>,0.1f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 10:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000000</span>,<span style="color: #cc66cc;">10</span>,0.01f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 100:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000000</span>,<span style="color: #cc66cc;">10</span>,0.001f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 1000:1</span>
    <span style="color: #666666; font-style: italic;">/////</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000000</span>,<span style="color: #cc66cc;">50</span>,0.1f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 10:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000000</span>,<span style="color: #cc66cc;">50</span>,0.01f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 100:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000000</span>,<span style="color: #cc66cc;">50</span>,0.001f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 1000:1</span>
    <span style="color: #666666; font-style: italic;">/////</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">100000</span>,<span style="color: #cc66cc;">100</span>,1f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 1:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">100000</span>,<span style="color: #cc66cc;">100</span>,0.1f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 10:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">100000</span>,<span style="color: #cc66cc;">100</span>,0.01f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 100:1</span>
    benchmark2<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">100000</span>,<span style="color: #cc66cc;">100</span>,0.001f<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//r:w 1000:1</span>
    <span style="color: #666666; font-style: italic;">////</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> args<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    test<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

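<p>The tests were run on my local machine with these JVM options: -Xms2048M -Xmx2048M -XX:+UseParallelGC -XX:+AggressiveOpts -XX:+UseFastAccessorMethods -XX:+HeapDumpOnOutOfMemoryError. The table below shows the results for 10, 50, and 100 concurrent threads, in that order:<br />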
<a href="http://www.kafka0102.com/wp-content/uploads/2010/08/lock_data.png"><img src="http://www.kafka0102.com/wp-content/uploads/2010/08/lock_data.png" alt="" title="lock_data" width="556" height="375" class="aligncenter size-full wp-image-305" /></a><br />
<br />
The comparison charts below were plotted from this data with OpenOffice:<br />
1. Average elapsed time with 10 concurrent threads:<br />
<a href="http://www.kafka0102.com/wp-content/uploads/2010/08/10_lock.jpg"><img src="http://www.kafka0102.com/wp-content/uploads/2010/08/10_lock.jpg" alt="" title="10_lock" width="599" height="330" class="aligncenter size-full wp-image-306" /></a><br />
<br />
2. Average elapsed time with 50 concurrent threads:<br />
<a href="http://www.kafka0102.com/wp-content/uploads/2010/08/50_lock.jpg"><img src="http://www.kafka0102.com/wp-content/uploads/2010/08/50_lock.jpg" alt="" title="50_lock" width="604" height="439" class="aligncenter size-full wp-image-307" /></a><br />
<br />
3. Average elapsed time with 100 concurrent threads:<br />
<a href="http://www.kafka0102.com/wp-content/uploads/2010/08/100_lock.jpg"><img src="http://www.kafka0102.com/wp-content/uploads/2010/08/100_lock.jpg" alt="" title="100_lock" width="598" height="362" class="aligncenter size-full wp-image-308" /></a><br />
<br />
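Don't take the absolute numbers too seriously: repeated runs of the same case give different figures, but the relative ordering of the implementations should stay broadly the same. Note that there is no 1:1 read/write data for 50 threads. That was an oversight on my part, which I only noticed when compiling the results. For 100 threads I reduced the per-thread loop count from 1,000,000 to 100,000, because at 1,000,000 iterations my machine's CPU was saturated, the run took a very long time to produce a result, and there were many full GCs.</p>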
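<p>The following conclusions can be drawn from the results:<br />
1. When reads and writes are balanced (1:1), all implementations run at about the same speed; only ConcurrentHashMap does slightly better, thanks to its segmented locking.<br />
2. Across all thread counts and read/write ratios, hashtable performs poorly, even compared with synclock and mutexlock, which take an explicit mutual-exclusion lock.<br />
3. The more concurrent threads there are, the more visible the cost of locking becomes. If the work done while holding the lock took longer (file operations, for example), the contrast would be even more pronounced.<br />
4. In raw numbers, under the most extreme condition (100 threads, read/write ratio 1000:1), the ratio between the worst performer, hashtable, and the best, concurrent, is 11:1. Under all other conditions the ratio is smaller.<br />
5. Comparing rwlock and writelock (rwlock takes a read-write lock; writelock leaves reads unlocked and guards writes with a mutex), the read-write lock does carry some overhead: the gap is severalfold, depending on the read/write ratio. Since the only difference between concurrent and writelock is that concurrent locks per segment, the two perform about the same when reads heavily outnumber writes.<br />
6. Overall, the approach taken by concurrent (unlocked reads, segment-locked writes) works best, although a read may occasionally see stale data.</p>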
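<p>For reference, the read-write-lock strategy from point 5 can be sketched roughly as below. This is a minimal illustration, not the RWLockMapWrapper used in the benchmark: the class name RwLockMapSketch and its two methods are hypothetical, and it assumes a plain HashMap guarded by a ReentrantReadWriteLock.</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: a HashMap guarded by a read-write lock.
class RwLockMapSketch<K, V> {
    private final Map<K, V> map = new HashMap<K, V>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    V get(K key) {
        lock.readLock().lock();   // shared: several readers may hold it at once
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    V put(K key, V value) {
        lock.writeLock().lock();  // exclusive: blocks readers and other writers
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        RwLockMapSketch<String, Integer> m = new RwLockMapSketch<String, Integer>();
        m.put("k", 1);
        System.out.println(m.get("k")); // 1
    }
}
```

<p>Every get still pays for acquiring and releasing the read lock, which is consistent with the overhead observed for rwlock relative to the variants that do not lock reads.</p>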
<p>If you know a better or more precise way to run this kind of benchmark, please do share it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/298.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>[Solr Source Code Analysis] How LRUCache and FastLRUCache Are Implemented</title>
		<link>http://www.kafka0102.com/2010/08/293.html</link>
		<comments>http://www.kafka0102.com/2010/08/293.html#comments</comments>
		<pubDate>Sun, 08 Aug 2010 16:01:42 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[cache]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[HashMap]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[source code analysis]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=293</guid>
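		<description><![CDATA[	In [Solr in Practice] Introduction to and Analysis of Solr Caches, I gave a brief introduction to Solr's LRUCache and FastLRUCache. This post builds on that with some notes on how they are implemented.
1. LRUCache implementation
	Before looking at LRUCache, a few words on LinkedHashMap. LinkedHashMap extends HashMap and uses a doubly linked list to keep the Entry objects of the map in order. Two orderings are available, access order (LRU) and insertion order, selected by the accessOrder argument of the constructor public LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder). So for operations such as get, put, and remove, LinkedHashMap does everything HashMap does and additionally maintains the ordered Entry list.
	Take get as an example. With LRU ordering (accessOrder is true), the Entry's recordAccess method moves the accessed Entry to the end of the list, the most recently used position: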

   public V get&#40;Object key&#41; &#123;
        Entry&#60;K,V&#62; e = &#40;Entry&#60;K,V&#62;&#41;getEntry&#40;key&#41;;
        if &#40;e [...]]]></description>
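			<content:encoded><![CDATA[<p>	In <a href="http://www.kafka0102.com/2010/08/267.html" target="blank">[Solr in Practice] Introduction to and Analysis of Solr Caches</a>, I gave a brief introduction to Solr's LRUCache and FastLRUCache. This post builds on that with some notes on how they are implemented.</p>
<h2>1. LRUCache implementation</h2>
<p>	Before looking at LRUCache, a few words on LinkedHashMap. LinkedHashMap extends HashMap and uses a doubly linked list to keep the Entry objects of the map in order. Two orderings are available, access order (LRU) and insertion order, selected by the accessOrder argument of the constructor public LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder). So for operations such as get, put, and remove, LinkedHashMap does everything HashMap does and additionally maintains the ordered Entry list.<br />
	Take get as an example. With LRU ordering (accessOrder is true), the Entry's recordAccess method moves the accessed Entry to the end of the list, the most recently used position (the eldest entry sits at the head, header.after):</p>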

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">   <span style="color: #000000; font-weight: bold;">public</span> V get<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> e <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#41;</span>getEntry<span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>e <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span>
            <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
        e.<span style="color: #006633;">recordAccess</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">this</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">return</span> e.<span style="color: #006633;">value</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span></pre></div></div>
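
<p>The effect of accessOrder can be seen with a small experiment. The sketch below is illustrative only: the class AccessOrderDemo and the method keysAfterGet are made-up names, and it relies solely on the public LinkedHashMap API.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the JDK or Solr.
class AccessOrderDemo {
    // Returns the key iteration order of a three-entry map after get("a").
    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // with accessOrder=true this relinks "a" as the most recently used entry
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet(true));  // [b, c, a]
        System.out.println(keysAfterGet(false)); // [a, b, c]
    }
}
```

<p>Iteration runs from the eldest entry to the most recently used one, so in access order a successful get changes the order of the list, while in insertion order it does not.</p>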

<p>    For put, LinkedHashMap overrides the addEntry method:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">   <span style="color: #000066; font-weight: bold;">void</span> addEntry<span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> hash, K key, V value, <span style="color: #000066; font-weight: bold;">int</span> bucketIndex<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        createEntry<span style="color: #009900;">&#40;</span>hash, key, value, bucketIndex<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #666666; font-style: italic;">// Remove eldest entry if instructed, else grow capacity if appropriate</span>
        Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> eldest <span style="color: #339933;">=</span> header.<span style="color: #006633;">after</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>removeEldestEntry<span style="color: #009900;">&#40;</span>eldest<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            removeEntryForKey<span style="color: #009900;">&#40;</span>eldest.<span style="color: #006633;">key</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>size <span style="color: #339933;">&gt;=</span> threshold<span style="color: #009900;">&#41;</span>
                resize<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2</span> <span style="color: #339933;">*</span> table.<span style="color: #006633;">length</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span></pre></div></div>

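<p>    addEntry calls boolean removeEldestEntry(Map.Entry&lt;K,V&gt; eldest). The default implementation always returns false, so by default the map has no capacity limit. A subclass of LinkedHashMap can override this method to return true once the current size exceeds a threshold, in which case LinkedHashMap removes the eldest Entry from the ordered list. This turns LinkedHashMap into a cache: it holds a bounded number of elements and offers two eviction policies (LRU and FIFO), of which LRU is the more commonly used.<br />
	Solr's LRUCache is built on LinkedHashMap, so its implementation is really simple. The core code fragments are:</p>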

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> init<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Map</span> args, <span style="color: #003399;">Object</span> persistence, <span style="color: #000000; font-weight: bold;">final</span> CacheRegenerator regenerator<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #666666; font-style: italic;">// (argument parsing and initialization code omitted)</span>
	<span style="color: #666666; font-style: italic;">// create the backing map</span>
    map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> LinkedHashMap<span style="color: #009900;">&#40;</span>initialSize, 0.75f, <span style="color: #000066; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      @Override
      <span style="color: #000000; font-weight: bold;">protected</span> <span style="color: #000066; font-weight: bold;">boolean</span> removeEldestEntry<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Map.<span style="color: #006633;">Entry</span></span> eldest<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>size<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;</span> limit<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          <span style="color: #666666; font-style: italic;">// increment evictions regardless of state.</span>
          <span style="color: #666666; font-style: italic;">// this doesn't need to be synchronized because it will</span>
          <span style="color: #666666; font-style: italic;">// only be called in the context of a higher level synchronized block.</span>
          evictions<span style="color: #339933;">++;</span>
          stats.<span style="color: #006633;">evictions</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
          <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">true</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">false</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>persistence<span style="color: #339933;">==</span><span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #666666; font-style: italic;">// must be the first time a cache of this type is being created</span>
      persistence <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> CumulativeStats<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    stats <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>CumulativeStats<span style="color: #009900;">&#41;</span>persistence<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">return</span> persistence<span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">synchronized</span> <span style="color: #009900;">&#40;</span>map<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>state <span style="color: #339933;">==</span> State.<span style="color: #006633;">LIVE</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        stats.<span style="color: #006633;">inserts</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
      <span style="color: #666666; font-style: italic;">// increment local inserts regardless of state???</span>
      <span style="color: #666666; font-style: italic;">// it does make it more consistent with the current size...</span>
      inserts<span style="color: #339933;">++;</span>
      <span style="color: #000000; font-weight: bold;">return</span> map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key,value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">synchronized</span> <span style="color: #009900;">&#40;</span>map<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Object</span> val <span style="color: #339933;">=</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>state <span style="color: #339933;">==</span> State.<span style="color: #006633;">LIVE</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// only increment lookups and hits if we are live.</span>
        lookups<span style="color: #339933;">++;</span>
        stats.<span style="color: #006633;">lookups</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>val<span style="color: #339933;">!=</span><span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          hits<span style="color: #339933;">++;</span>
          stats.<span style="color: #006633;">hits</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
      <span style="color: #009900;">&#125;</span>
      <span style="color: #000000; font-weight: bold;">return</span> val<span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span></pre></div></div>

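<p>  As you can see, LRUCache simply takes a mutex around every read and write, so concurrent readers and writers contend for the same lock. Cache workloads are usually read-heavy, so the lack of concurrent reads is not ideal. Still, compared with the other expensive operations in Solr, the serialized reads of LRUCache are rarely the bottleneck. The advantage of LRUCache is its simplicity: it reuses LinkedHashMap directly. The drawback is that, because LinkedHashMap's get has to update the ordered Entry list, the whole operation must be locked.</p>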
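<p>Stripped of statistics and state handling, the same pattern fits in a few lines. The following is a minimal sketch under my own assumptions, not Solr code: SimpleLruCache is a hypothetical name, and the fixed initial capacity of 16 is arbitrary.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded LRU cache built on LinkedHashMap; not Solr's LRUCache.
class SimpleLruCache<K, V> {
    private final Map<K, V> map;

    SimpleLruCache(final int limit) {
        // accessOrder = true keeps the entries in LRU order
        map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > limit; // evict the eldest entry once the limit is exceeded
            }
        };
    }

    V get(K key) {
        synchronized (map) { // get relinks the entry, so it needs the lock too
            return map.get(key);
        }
    }

    V put(K key, V value) {
        synchronized (map) {
            return map.put(key, value);
        }
    }

    int size() {
        synchronized (map) {
            return map.size();
        }
    }

    public static void main(String[] args) {
        SimpleLruCache<String, Integer> cache = new SimpleLruCache<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds the limit, so "b" (least recently used) is evicted
        System.out.println(cache.get("b")); // null
        System.out.println(cache.get("a")); // 1
        System.out.println(cache.size());   // 2
    }
}
```

<p>Both get and put synchronize on the map because, in access order, even a read modifies the linked list.</p>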
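<h2>2. FastLRUCache implementation</h2>
<p>	Solr 1.4 introduced FastLRUCache as an alternative implementation. FastLRUCache drops LinkedHashMap in favor of ConcurrentHashMap, which many Java cache implementations now use. ConcurrentHashMap provides high-performance concurrent access but no support for evicting entries, so eviction is the main thing FastLRUCache has to add. All of FastLRUCache's read and write operations are implemented in ConcurrentLRUCache, so let's go straight to ConcurrentLRUCache.<br />
	The access methods of ConcurrentLRUCache look like this:</p>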

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">  <span style="color: #000000; font-weight: bold;">public</span> V get<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> K key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> CacheEntry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> e <span style="color: #339933;">=</span> map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>e <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>islive<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        stats.<span style="color: #006633;">missCounter</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
      <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>islive<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      e.<span style="color: #006633;">lastAccessed</span> <span style="color: #339933;">=</span> stats.<span style="color: #006633;">accessCounter</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">return</span> e.<span style="color: #006633;">value</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> V remove<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> K key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> CacheEntry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> cacheEntry <span style="color: #339933;">=</span> map.<span style="color: #006633;">remove</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>cacheEntry <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      stats.<span style="color: #006633;">size</span>.<span style="color: #006633;">decrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">return</span> cacheEntry.<span style="color: #006633;">value</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> K key, <span style="color: #000000; font-weight: bold;">final</span> V val<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>val <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">final</span> CacheEntry e <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> CacheEntry<span style="color: #009900;">&#40;</span>key, val, stats.<span style="color: #006633;">accessCounter</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> CacheEntry oldCacheEntry <span style="color: #339933;">=</span> map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span>key, e<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000066; font-weight: bold;">int</span> currentSize<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>oldCacheEntry <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      currentSize <span style="color: #339933;">=</span> stats.<span style="color: #006633;">size</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
      currentSize <span style="color: #339933;">=</span> stats.<span style="color: #006633;">size</span>.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>islive<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      stats.<span style="color: #006633;">putCounter</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
      stats.<span style="color: #006633;">nonLivePutCounter</span>.<span style="color: #006633;">incrementAndGet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #666666; font-style: italic;">// Check if we need to clear out old entries from the cache.</span>
    <span style="color: #666666; font-style: italic;">// isCleaning variable is checked instead of markAndSweepLock.isLocked()</span>
    <span style="color: #666666; font-style: italic;">// for performance because every put invocation will check until</span>
    <span style="color: #666666; font-style: italic;">// the size is back to an acceptable level.</span>
    <span style="color: #666666; font-style: italic;">// There is a race between the check and the call to markAndSweep, but</span>
    <span style="color: #666666; font-style: italic;">// it's unimportant because markAndSweep actually acquires the lock or returns if it can't.</span>
    <span style="color: #666666; font-style: italic;">// Thread safety note: isCleaning read is piggybacked (comes after) other volatile reads</span>
    <span style="color: #666666; font-style: italic;">// in this method.</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>currentSize <span style="color: #339933;">&gt;</span> upperWaterMark <span style="color: #339933;">&amp;&amp;</span> <span style="color: #339933;">!</span>isCleaning<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>newThreadForCleanup<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Thread</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          @Override
          <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> run<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            markAndSweep<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
          <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>.<span style="color: #006633;">start</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>cleanupThread <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
        cleanupThread.<span style="color: #006633;">wakeThread</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">else</span> <span style="color: #009900;">&#123;</span>
        markAndSweep<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">return</span> oldCacheEntry <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span> <span style="color: #339933;">?</span> <span style="color: #000066; font-weight: bold;">null</span> <span style="color: #339933;">:</span> oldCacheEntry.<span style="color: #006633;">value</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span></pre></div></div>

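<p>All operations delegate directly to map (a ConcurrentHashMap). Look at the code in put: when the map has grown past its upper limit and no other thread is already cleaning up (currentSize &gt; upperWaterMark &amp;&amp; !isCleaning), markAndSweep is called to evict entries. The cleanup can run in one of three ways: 1) synchronously in the calling thread, 2) asynchronously in a thread started on the spot, or 3) asynchronously in a dedicated cleanup thread that is woken up on demand.</p>
<p>The markAndSweep method is quite long, so it is not listed here. Its approach is described below.</p>
<p>Every CacheEntry in ConcurrentLRUCache has a field lastAccessed, which records the counter value at the entry's most recent access. stats.accessCounter in ConcurrentLRUCache is a global auto-incrementing integer; whenever an entry is put or fetched, its lastAccessed is set to the newly incremented accessCounter value. ConcurrentLRUCache evicts the entries whose lastAccessed is smallest. Because it does not maintain a linked list of entries ordered by lastAccessed (otherwise it would simply be LRUCache), eviction has to traverse every element of the Map to find suitable entries. Does that mean sorting? It does not need anything that heavy.</p>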
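<p>Define a few variables: wantToKeep is the number of entries the Map should retain; wantToRemove is the number to delete (wantToRemove = map.size - wantToKeep); newestEntry is the largest lastAccessed value (initially stats.accessCounter). These three are known from the start. oldestEntry is the smallest lastAccessed value; it is not known up front and is narrowed down to the minimum by comparison while traversing the entries. The entries in the Map fall into three kinds: (a) those that can immediately be judged evictable, namely lastAccessed &lt; (oldestEntry + wantToRemove); (b) those that can immediately be judged safe to keep, namely lastAccessed &gt; (newestEntry - wantToKeep); and (c) everything else, for which no decision can be made yet. For a single traversal of the Map, the best case is that evicting the entries of kind (a) already brings the Map down to wantToKeep; the typical example is a cache that only sees get and put operations, so that the lastAccessed values in the Map stay contiguous. The worst case is that few or no entries satisfy (a). Even so, one traversal at least narrows the entries still to be processed down to those of kind (c). Repeating this is guaranteed to bring the Map down to wantToKeep. In practice wantToKeep is usually close to map.size (for example 0.9 * map.size), so as long as there are not many remove operations, the cleanup finishes without many traversals.</p>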
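<p>The three-way classification above can be sketched as a single pass. This is only a minimal illustration, not Solr's markAndSweep: the class SweepSketch, the method onePass and the plain long[] of lastAccessed values are names assumed for the example, and the bound for kind (b) assumes that at most wantToKeep entries can carry a counter value above newestEntry - wantToKeep.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of one classification pass; not Solr's actual markAndSweep. */
class SweepSketch {
  /** Outcome of one pass over the entries. */
  static final class PassResult {
    int removed;                                         // kind (a): evicted immediately
    int kept;                                            // kind (b): certainly among the newest
    final List<Long> undecided = new ArrayList<Long>();  // kind (c): needs another look
  }

  static PassResult onePass(long[] lastAccessed, long oldestEntry, long newestEntry, int wantToKeep) {
    int wantToRemove = lastAccessed.length - wantToKeep;
    PassResult r = new PassResult();
    for (long t : lastAccessed) {
      if (t > newestEntry - wantToKeep) {
        r.kept++;              // (b) cannot be in the bottom group
      } else if (t < oldestEntry + wantToRemove) {
        r.removed++;           // (a) must be in the bottom group
      } else {
        r.undecided.add(t);    // (c) decided in a later pass
      }
    }
    return r;
  }

  public static void main(String[] args) {
    // Ten entries with contiguous counter values 1..10; keep 8, so 2 must go.
    long[] stamps = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    PassResult r = onePass(stamps, 1, 10, 8);
    System.out.println(r.removed + " removed, " + r.kept + " kept, " + r.undecided.size() + " undecided");
  }
}
```

<p>With contiguous values the pass settles every entry at once, which is the best case described above; gaps in the counter values leave entries in the undecided group.</p>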
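<p>That is the basic idea behind eviction in ConcurrentLRUCache. Its execution can be divided into three phases. The first phase traverses every entry in the Map: entries of kind (a) are removed, entries of kind (b) are skipped, and entries of kind (c) are collected separately for the next phase. If map.size is still larger than wantToKeep after this pass, the second phase repeats the same process (in the implementation, Solr uses a variable numPasses that looks like a switch for controlling the number of passes; at present it is fixed at one). If map.size is still larger than wantToKeep after that, the third phase makes one more traversal, this time using a PriorityQueue to pick out the N oldest entries that still have to be evicted, and that pass finishes the job. One more point: what is called wantToKeep above corresponds to acceptableWaterMark and lowerWaterMark in the code. The sweep counts as complete once the size reaches acceptableWaterMark, but each pass works toward lowerWaterMark.</p>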
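<p>The third phase can be sketched as selecting the N oldest values with a bounded heap. This is a minimal illustration under assumptions: it uses java.util.PriorityQueue rather than whatever queue class Solr uses internally, and the class OldestPicker and the method oldest are hypothetical names.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of the final pass: pick the n oldest stamps with a bounded heap. */
class OldestPicker {
  static List<Long> oldest(long[] lastAccessed, int n) {
    List<Long> out = new ArrayList<Long>();
    if (n <= 0) {
      return out;
    }
    // Max-heap: the head is the newest of the candidates collected so far.
    PriorityQueue<Long> heap = new PriorityQueue<Long>(n, Collections.<Long>reverseOrder());
    for (long t : lastAccessed) {
      if (heap.size() < n) {
        heap.add(t);
      } else if (t < heap.peek()) {
        heap.poll();   // drop the newest candidate
        heap.add(t);   // keep the older one instead
      }
    }
    out.addAll(heap);
    Collections.sort(out);
    return out;
  }

  public static void main(String[] args) {
    long[] stamps = {42, 7, 19, 3, 88, 15};
    System.out.println(oldest(stamps, 3)); // prints [3, 7, 15]
  }
}
```

<p>The heap never holds more than n elements, so one traversal costs roughly n log n for the heap work on top of the scan itself, without sorting the whole Map.</p>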
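<p>The time complexity of this algorithm is 2n + k ln(k), where k is very small in most practical cases, so it is usually faster than a straightforward heap sort.</p>
<h2>3. Summary</h2>
<p>The two cache implementations, LRUCache and FastLRUCache, follow two very different approaches. What they share is that both use an existing Map to hold the data. Where they differ is in how they evict data. LRUCache (that is, LinkedHashMap) maintains an extra structure and updates it on every read and write; the advantage is that eviction is O(1), the drawback is that reads and writes need a mutex. FastLRUCache is the opposite: it maintains no extra structure and lets ConcurrentHashMap support concurrent reads, but when a put has to evict data the eviction is O(n). Because the sweep does not lock the map, this only affects the performance of that one put, and FastLRUCache can also be configured to run the sweep asynchronously in a separate thread to reduce the impact. Another cache implementation, Ehcache, evicts synchronously, but it limits how many entries are evicted each time (usually fewer than 5), so performance does not suffer much even though eviction is synchronous.</p>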
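<p>For comparison, the LinkedHashMap approach can be sketched in a few lines. This is a minimal illustration, not Solr's LRUCache class: the class SimpleLru and the method create are hypothetical names, and Collections.synchronizedMap stands in for whatever locking a real cache would use.</p>

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the LinkedHashMap approach; not Solr's LRUCache class. */
class SimpleLru {
  static <K, V> Map<K, V> create(final int limit) {
    // accessOrder=true makes get() move the entry to the tail,
    // so even a read modifies the internal linked list.
    Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > limit; // O(1) eviction of the head entry on insert
      }
    };
    // Every operation, reads included, goes through a single mutex.
    return Collections.synchronizedMap(lru);
  }

  public static void main(String[] args) {
    Map<String, Integer> cache = create(2);
    cache.put("a", 1);
    cache.put("b", 2);
    cache.get("a");     // "a" becomes the most recently used entry
    cache.put("c", 3);  // evicts "b", the least recently used entry
    System.out.println(cache.keySet()); // prints [a, c]
  }
}
```

<p>Because get reorders the list, reads cannot run concurrently here, which is the cost that the ConcurrentHashMap-based design avoids.</p>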
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/293.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Analyzing why threads hang when multiple threads write to a HashMap concurrently</title>
		<link>http://www.kafka0102.com/2010/08/286.html</link>
		<comments>http://www.kafka0102.com/2010/08/286.html#comments</comments>
		<pubDate>Fri, 06 Aug 2010 21:05:39 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[HashMap]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=286</guid>
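		<description><![CDATA[I came across a post on blogjava, "Can anyone help explain why this program deadlocks?", and it stirred the kind of curiosity that kills cats, so I put real effort into working the problem out. There is a lot to cover, so I am writing it up as a separate post.
The problem program from that post is as follows: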

public class TestLock &#123;
  private final HashMap map = new HashMap&#40;&#41;;
  public TestLock&#40;&#41; &#123;
    final Thread t1 = new Thread&#40;&#41; &#123;
      @Override
      public void run&#40;&#41; &#123;
        for&#40;int i=0; i&#60;500000; i++&#41; &#123;
 [...]]]></description>
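			<content:encoded><![CDATA[<p>I came across a post on blogjava, <a href="http://www.blogjava.net/zhvfeng/archive/2010/08/04/327956.html" target="blank">Can anyone help explain why this program deadlocks?</a>, and it stirred the kind of curiosity that kills cats, so I put real effort into working the problem out. There is a lot to cover, so I am writing it up as a separate post.</p>
<p>The problem program from that post is as follows:</p>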

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> TestLock <span style="color: #009900;">&#123;</span>
  <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">HashMap</span> map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">HashMap</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #000000; font-weight: bold;">public</span> TestLock<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Thread</span> t1 <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Thread</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      @Override
      <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> run<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;</span><span style="color: #cc66cc;">500000</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Integer</span><span style="color: #009900;">&#40;</span>i<span style="color: #009900;">&#41;</span>, i<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;t1 over&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Thread</span> t2 <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Thread</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      @Override
      <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> run<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i<span style="color: #339933;">&lt;</span><span style="color: #cc66cc;">500000</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
          map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Integer</span><span style="color: #009900;">&#40;</span>i<span style="color: #009900;">&#41;</span>, i<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;t2 over&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
    t1.<span style="color: #006633;">start</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    t2.<span style="color: #006633;">start</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> args<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">new</span> TestLock<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

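<p>It simply starts two threads that keep putting entries into a HashMap, which is not thread-safe. What they put is trivial: both key and value are integers counting up from 0 (this choice of content is not a good one, and it later got in the way of my analysis). I had assumed that concurrent writes to a HashMap would at worst produce dirty data, but when this program is run repeatedly, threads t1 and t2 can hang. In most cases one thread hangs while the other finishes successfully; occasionally both hang. If at this point you think I should be studying ConcurrentHashMap properly instead of fooling around here, please go easy on me and skip ahead.<br />
All right, let's go through the source of HashMap's put function to see where the problem lies. The relevant code (jdk1.6) is listed here:</p>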

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> V put<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> K key, <span style="color: #000000; font-weight: bold;">final</span> V value<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>key <span style="color: #339933;">==</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #000000; font-weight: bold;">return</span> putForNullKey<span style="color: #009900;">&#40;</span>value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> hash <span style="color: #339933;">=</span> hash<span style="color: #009900;">&#40;</span>key.<span style="color: #006633;">hashCode</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> indexFor<span style="color: #009900;">&#40;</span>hash, table.<span style="color: #006633;">length</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span>Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> e <span style="color: #339933;">=</span> table<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span> e <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span> e <span style="color: #339933;">=</span> e.<span style="color: #006633;">next</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span><span style="color: #666666; font-style: italic;">//@mark 1</span>
      <span style="color: #003399;">Object</span> k<span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>e.<span style="color: #006633;">hash</span> <span style="color: #339933;">==</span> hash <span style="color: #339933;">&amp;&amp;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>k <span style="color: #339933;">=</span> e.<span style="color: #006633;">key</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> key <span style="color: #339933;">||</span> key.<span style="color: #006633;">equals</span><span style="color: #009900;">&#40;</span>k<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">final</span> V oldValue <span style="color: #339933;">=</span> e.<span style="color: #006633;">value</span><span style="color: #339933;">;</span>
        e.<span style="color: #006633;">value</span> <span style="color: #339933;">=</span> value<span style="color: #339933;">;</span>
        e.<span style="color: #006633;">recordAccess</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">this</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">return</span> oldValue<span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    modCount<span style="color: #339933;">++;</span>
    addEntry<span style="color: #009900;">&#40;</span>hash, key, value, i<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000066; font-weight: bold;">void</span> addEntry<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> hash, <span style="color: #000000; font-weight: bold;">final</span> K key, <span style="color: #000000; font-weight: bold;">final</span> V value, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> bucketIndex<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> e <span style="color: #339933;">=</span> table<span style="color: #009900;">&#91;</span>bucketIndex<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    table<span style="color: #009900;">&#91;</span>bucketIndex<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span>hash, key, value, e<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>size<span style="color: #339933;">++</span> <span style="color: #339933;">&gt;=</span> threshold<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      resize<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2</span> <span style="color: #339933;">*</span> table.<span style="color: #006633;">length</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000066; font-weight: bold;">void</span> resize<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> newCapacity<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> Entry<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> oldTable <span style="color: #339933;">=</span> table<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> oldCapacity <span style="color: #339933;">=</span> oldTable.<span style="color: #006633;">length</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>oldCapacity <span style="color: #339933;">==</span> MAXIMUM_CAPACITY<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      threshold <span style="color: #339933;">=</span> <span style="color: #003399;">Integer</span>.<span style="color: #006633;">MAX_VALUE</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">final</span> Entry<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> newTable <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Entry<span style="color: #009900;">&#91;</span>newCapacity<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    transfer<span style="color: #009900;">&#40;</span>newTable<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    table <span style="color: #339933;">=</span> newTable<span style="color: #339933;">;</span>
    threshold <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#40;</span>newCapacity <span style="color: #339933;">*</span> loadFactor<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000066; font-weight: bold;">void</span> transfer<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> Entry<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> newTable<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">final</span> Entry<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> src <span style="color: #339933;">=</span> table<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> newCapacity <span style="color: #339933;">=</span> newTable.<span style="color: #006633;">length</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span> time1 <span style="color: #339933;">=</span> <span style="color: #003399;">System</span>.<span style="color: #006633;">currentTimeMillis</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> j <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> j <span style="color: #339933;">&lt;</span> src.<span style="color: #006633;">length</span><span style="color: #339933;">;</span> j<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> e <span style="color: #339933;">=</span> src<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>e <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        src<span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> k<span style="color: #339933;">=</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//@mark 2</span>
        <span style="color: #000000; font-weight: bold;">do</span> <span style="color: #009900;">&#123;</span>
          <span style="color: #000000; font-weight: bold;">final</span> Entry<span style="color: #339933;">&lt;</span>K,V<span style="color: #339933;">&gt;</span> next <span style="color: #339933;">=</span> e.<span style="color: #006633;">next</span><span style="color: #339933;">;</span>
          <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> indexFor<span style="color: #009900;">&#40;</span>e.<span style="color: #006633;">hash</span>, newCapacity<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
          e.<span style="color: #006633;">next</span> <span style="color: #339933;">=</span> newTable<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
          newTable<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> e<span style="color: #339933;">;</span>
          e <span style="color: #339933;">=</span> next<span style="color: #339933;">;</span>
          <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>k<span style="color: #339933;">++</span> <span style="color: #339933;">&gt;</span> <span style="color: #cc66cc;">1000</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span><span style="color: #666666; font-style: italic;">//@mark 3</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #003399;">Thread</span>.<span style="color: #006633;">currentThread</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">getName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                <span style="color: #0000ff;">&quot;,e==next:&quot;</span><span style="color: #339933;">+</span><span style="color: #009900;">&#40;</span>e<span style="color: #339933;">==</span>e.<span style="color: #006633;">next</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;,e==next.next:&quot;</span><span style="color: #339933;">+</span><span style="color: #009900;">&#40;</span>e<span style="color: #339933;">==</span>e.<span style="color: #006633;">next</span>.<span style="color: #006633;">next</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                <span style="color: #0000ff;">&quot;,e:&quot;</span><span style="color: #339933;">+</span>e<span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;,next:&quot;</span><span style="color: #339933;">+</span>e.<span style="color: #006633;">next</span><span style="color: #339933;">+</span><span style="color: #0000ff;">&quot;,eq:&quot;</span><span style="color: #339933;">+</span>e.<span style="color: #006633;">equals</span><span style="color: #009900;">&#40;</span>e.<span style="color: #006633;">next</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
              <span style="color: #003399;">Thread</span>.<span style="color: #006633;">sleep</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">2000</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">Exception</span> e2<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #009900;">&#125;</span>
&nbsp;
          <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span>e <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span></pre></div></div>

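<p>  With jconsole (or a thread dump) you can see that the thread is stuck in the while loop of the transfer method. When the number of elements in the Map exceeds the threshold and a resize is needed, transfer is the method that moves the entries of the old table into the new one. I modified HashMap and added the snippets at @标记2 and @标记3 (markers 2 and 3) to print the state once the loop stops terminating. The looping thread always prints something like: "Thread-1,e==next:false,e==next.next:true,e:108928=108928,next:108928=108928,eq:true".<br />
This output shows two things:<br />
  1) Two Entry objects in this chain satisfy e == e.next.next, which is what makes the loop infinite.<br />
  2) e.equals(e.next) holds although e != e.next. Both threads in the test put the same data, so under concurrency the same key can end up stored in more than one Entry. That error is produced in addEntry and has nothing to do with the infinite loop.</p>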
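<p>  Next, the while loop in transfer. First, what it does normally: src[j] holds the linked list of entries that map to the same bucket index; it may be null, a single Entry, or several linked Entry objects. Suppose there are several, and the original chain is (src[j]=a)-&gt;b (that is, src[j]=a, a.next=b, b.next=null). After the loop, the result is (newTable[i]=b)-&gt;a. In other words, the next links of the chain are reversed.</p>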
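<p>  Now look at the statements in this loop that can go wrong under multiple threads. For two threads t1 and t2, here is my own analysis of execution orders that could cause trouble:</p>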
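<p>  1) Suppose both threads work on the same Entry list [e-&gt;f-&gt;...]; t1 arrives first, t2 later, and both enter the while loop. t1 executes "e.next = newTable[i]; newTable[i] = e;", so e.next becomes null (newTable[i] is initially null) and newTable[i] points to e. Then t2 executes the same two statements, which makes e.next = e: e now points to itself. Because of "final Entry&lt;K,V&gt; next = e.next;" at the top of the loop, both threads still step past e after the final "e = next;", even though e has become a self-loop.</p>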
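<p>  2) While the loop walks the chain and reverses the next links, newTable[i] is the reference being swapped, and the suspicious statement is "e.next = newTable[i];". Suppose t1 has turned the list e-&gt;f-&gt;g into e&lt;-f&lt;-g, with newTable[i] pointing to g. If t2 comes in now, executing "e.next = newTable[i];" makes e-&gt;g, which closes a cycle. So in theory the cycle can contain many entries. Even so, t1 has already moved past the cycle and keeps going. t2 also keeps going if the next it read in "final Entry&lt;K,V&gt; next = e.next;" is null; otherwise it enters the infinite loop.</p>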
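<p>  3) The real picture seems more complicated. Even if a thread gets out of the cycle, the next time it resizes and enters transfer it may hang on the cyclic chain left behind (it seems it must hang). A thread may also hang in put while scanning the Entry chain (@标记1, marker 1) because the chain is cyclic. It also seems possible for the live thread and the looping thread to run in the while loop at the same time and both get out alive. So both threads may exit safely, one thread may hang in transfer, or both may hang, and not necessarily in the same place.</p>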
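<p>  4) In my repeated tests, the most common outcome was a single hung thread, always caused by e == e.next.next, mainly because the example puts two copies of the same incrementing data. If I remove the output at @标记3, I can sometimes reproduce both threads hanging, but with the output in place that is hard to reproduce. I also changed the data, for example having the two threads put different ranges, and could then reproduce e == e.next with both threads hung.</p>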
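<p>  I have rambled a lot above. After a quick first analysis I thought I understood what was going on, but after thinking it over carefully I find I understand less of it. One detail: the key on which the loop occurs follows a pattern each time, which I will not go into. I suspect that with more samples the possible causes would be more numerous and more complicated, so I will stop torturing myself here. As for the claim that ConcurrentHashMap has the same problem, I think that is unlikely: its put takes a lock, and if it had this problem it would not be called a thread-safe Map.</p>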
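<p>  As a supplement, here is a minimal single-threaded replay of one interleaving that ends in the observed state e == e.next.next. It is only a sketch under an assumption that goes beyond the analysis above: each thread gets its own newTable slot (as in the stock JDK 6 resize), and thread 1 is paused right after reading "next". The class and method names (TransferInterleaving, replay) are made up for illustration.</p>

```java
public class TransferInterleaving {
    /** Minimal stand-in for HashMap.Entry: only the key and the next pointer matter here. */
    public static final class Entry {
        public final String key;
        public Entry next;
        Entry(String key) { this.key = key; }
    }

    /**
     * Replays, on one thread, an interleaving of two threads that both run the
     * transfer loop over the same bucket chain a -> b.
     * Returns the head of thread 1's new bucket.
     */
    public static Entry replay() {
        Entry a = new Entry("a");
        Entry b = new Entry("b");
        a.next = b;                        // original bucket: a -> b -> null

        // Thread 1 reads "e" and "next", then is paused.
        Entry e1 = a;
        Entry next1 = e1.next;             // next1 == b

        // Thread 2 runs the whole loop against its own new table slot.
        Entry table2 = null;
        for (Entry e = a; e != null; ) {
            Entry next = e.next;
            e.next = table2;
            table2 = e;
            e = next;
        }                                  // the shared nodes are now b -> a -> null

        // Thread 1 resumes with its stale e1/next1 and its own table slot.
        Entry table1 = null;
        while (true) {
            e1.next = table1;
            table1 = e1;
            e1 = next1;
            if (e1 == null) break;
            next1 = e1.next;
        }
        return table1;                     // a -> b -> a -> ... (a two-node cycle)
    }

    public static void main(String[] args) {
        Entry head = replay();
        System.out.println("e==next:" + (head == head.next)
                + ",e==next.next:" + (head == head.next.next));
    }
}
```

<p>  In this replay thread 1 leaves the loop normally; a hang would only appear later, when some thread walks the cyclic chain again (for example in the next transfer or in put).</p>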
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/286.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The "syntax error, unexpected T_PAAMAYIM_NEKUDOTAYIM" error in PHP</title>
		<link>http://www.kafka0102.com/2010/08/281.html</link>
		<comments>http://www.kafka0102.com/2010/08/281.html#comments</comments>
		<pubDate>Fri, 06 Aug 2010 17:20:58 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=281</guid>
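		<description><![CDATA[Later today I needed to run some PHP tests on my own machine; the script depends on a pile of libraries whose purpose I don't know. As soon as I ran it, it failed with an error like: "Parse error: syntax error, unexpected T_PAAMAYIM_NEKUDOTAYIM in /home/kafka/test/test.php on line 8". Looking at the code, the offending statement was of the form "$class_name::func1();", that is, a static method call through a string variable that holds the class name, and it was rejected as a syntax error at parse time. (When I first saw ::, I thought of the C++ scope operator; it took me a while to remember that PHP has :: too, and that I had used self::doSomething() myself.)]]></description>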
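			<content:encoded><![CDATA[<p>Later today I needed to run some PHP tests on my own machine; the script depends on a pile of libraries whose purpose I don't know. As soon as I ran it, it failed with an error like: "Parse error: syntax error, unexpected T_PAAMAYIM_NEKUDOTAYIM in /home/kafka/test/test.php on line 8". Looking at the code, the offending statement was of the form "$class_name::func1();", that is, a static method call through a string variable that holds the class name, and it was rejected as a syntax error at parse time. (When I first saw ::, I thought of the C++ scope operator; it took me a while to remember that PHP has :: too, and that I had used self::doSomething() myself.) This code runs on the test and production machines, so it should be fine. I tried it on the test machine and indeed it worked. Comparing PHP versions, the test machine has the latest 5.3.3 while mine has 5.2.13, so the cause was presumably a difference in version or configuration. I googled it and, good grief, found page after page enthusiastically discussing what the odd phrase PAAMAYIM_NEKUDOTAYIM means. After reading enough of them I understood: Paamayim Nekudotayim is Hebrew for double colon. But I saw nobody mention how to fix the error. I finally found the answer on the official site, http://www.php.net/manual/en/language.oop5.paamayim-nekudotayim.php : the "$class_name::func1();" form is only supported from 5.3 onward. Ouch! After installing the latest PHP, the program ran normally. Well, keeping up with the times matters.</p>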
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/281.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>[Solr in Practice] Using and Analyzing the Solr Caches</title>
		<link>http://www.kafka0102.com/2010/08/267.html</link>
		<comments>http://www.kafka0102.com/2010/08/267.html#comments</comments>
		<pubDate>Sat, 31 Jul 2010 21:15:27 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[solr]]></category>
		<category><![CDATA[solr cache]]></category>
		<category><![CDATA[solr in practice]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=267</guid>
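		<description><![CDATA[	This article introduces the caches used in Solr queries and how they are implemented. The core class of Solr querying is SolrIndexSearcher. At any moment each core normally has one current SolrIndexSearcher serving the handlers above it (while searchers are being switched, two may serve requests at once). Solr's caches are attached to the SolrIndexSearcher: they live as long as it does, and when it goes away they are cleared and closed. The application-level caches in Solr include filterCache, queryResultCache and documentCache. They are all SolrCache implementations held as member fields of SolrIndexSearcher, each with its own logic and purpose, and are described and analyzed in turn below.]]></description>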
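			<content:encoded><![CDATA[<p>	This article introduces the caches used in Solr queries and how they are implemented. The core class of Solr querying is SolrIndexSearcher. At any moment each core normally has one current SolrIndexSearcher serving the handlers above it (while searchers are being switched, two may serve requests at once). Solr's caches are attached to the SolrIndexSearcher: they live as long as it does, and when it goes away they are cleared and closed. The application-level caches in Solr include filterCache, queryResultCache and documentCache. They are all SolrCache implementations held as member fields of SolrIndexSearcher, each with its own logic and purpose, and are described and analyzed in turn below.</p>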
<h2>1. SolrCache interface implementations</h2>
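<p>	Solr provides two implementations of the SolrCache interface: solr.search.LRUCache and solr.search.FastLRUCache. FastLRUCache was introduced in version 1.4 and is generally faster than LRUCache.<br />
	The main methods of the SolrCache interface are annotated below:</p>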

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">interface</span> SolrCache <span style="color: #009900;">&#123;</span>
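  <span style="color: #008000; font-style: italic; font-weight: bold;">/**
   * When Solr parses the configuration file and builds the SolrConfig instance, it
   * initializes the CacheConfig objects found in the configuration. When a
   * SolrIndexSearcher is constructed, it creates each SolrCache through the SolrConfig
   * instance via newInstance, which calls init. The args parameter is the Map of
   * implementation-specific parameters (for LRUCache or FastLRUCache). The persistence
   * parameter is global state: LRUCache and FastLRUCache use it to accumulate cache
   * access statistics (a cache is bound to one SolrIndexSearcher, so statistics that
   * outlive a searcher need a globally injected object). The regenerator parameter
   * defines how cache items are reloaded during autowarm. The CacheRegenerator
   * interface has a single method, called back from SolrCache.warm:
   * boolean regenerateItem(SolrIndexSearcher newSearcher,
   * SolrCache newCache, SolrCache oldCache, Object oldKey, Object oldVal)
   */</span>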
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> init<span style="color: #009900;">&#40;</span><span style="color: #003399;">Map</span> args, <span style="color: #003399;">Object</span> persistence, CacheRegenerator regenerator<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #008000; font-style: italic; font-weight: bold;">/** :TODO: copy from Map */</span>
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">int</span> size<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #008000; font-style: italic; font-weight: bold;">/** :TODO: copy from Map */</span>
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> put<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> key, <span style="color: #003399;">Object</span> value<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #008000; font-style: italic; font-weight: bold;">/** :TODO: copy from Map */</span>
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> get<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #008000; font-style: italic; font-weight: bold;">/** :TODO: copy from Map */</span>
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> clear<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #008000; font-style: italic; font-weight: bold;">/**
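   * Autowarms the caches of a newly created SolrIndexSearcher. An implementation
   * walks a suitable range of the existing cache (normally not every old item is
   * reloaded) and, for each item, calls the regenerator's regenerateItem method
   * to load the corresponding new item for the searcher.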
   */</span>
  <span style="color: #000066; font-weight: bold;">void</span> warm<span style="color: #009900;">&#40;</span>SolrIndexSearcher searcher, SolrCache old<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">IOException</span><span style="color: #339933;">;</span>
  <span style="color: #008000; font-style: italic; font-weight: bold;">/** Frees any non-memory resources */</span>
  <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> close<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>
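<p>The warm contract described in the comment above can be sketched without any Solr classes. In this sketch the Regenerator interface is a hypothetical, reduced stand-in for CacheRegenerator, the old cache is assumed to be an access-ordered LinkedHashMap (so its most recently used items come last in iteration order), and treating a false return value as "stop warming" is my assumption.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WarmSketch {
    /** Hypothetical stand-in for CacheRegenerator, reduced to what this sketch needs. */
    public interface Regenerator<K, V> {
        /** Recomputes the value for oldKey and stores it in newCache; false stops warming. */
        boolean regenerateItem(Map<K, V> newCache, K oldKey, V oldVal);
    }

    /**
     * Regenerates at most autowarmCount items from the tail of oldCache
     * (the most recently accessed ones when oldCache is access-ordered).
     */
    public static <K, V> void warm(LinkedHashMap<K, V> oldCache, Map<K, V> newCache,
                                   int autowarmCount, Regenerator<K, V> regenerator) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(oldCache.entrySet());
        int from = Math.max(0, entries.size() - autowarmCount);
        for (int i = from; i < entries.size(); i++) {
            Map.Entry<K, V> e = entries.get(i);
            if (!regenerator.regenerateItem(newCache, e.getKey(), e.getValue())) {
                break;
            }
        }
    }
}
```

<p>A regenerator passed to this sketch would typically ignore oldVal and recompute the value against the new searcher, since the index may have changed between searchers.</p>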

<h3>1.1 solr.search.LRUCache</h3>
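<p>LRUCache accepts the following parameters:<br />
1) size: the maximum number of items the cache can hold; the default is 1024.<br />
2) initialSize: the initial capacity of the cache; the default is 1024.<br />
3) autowarmCount: when the SolrIndexSearcher is switched, the new SolrIndexSearcher can be autowarmed. autowarmCount is the number of items taken from the old SolrIndexSearcher's cache and regenerated in the new one; how each item is regenerated is up to the CacheRegenerator. In the current 1.4 release of Solr, autowarmCount can only be an absolute item count; the future 4.0 release is expected to also accept a percentage of the existing cache items, to better balance the cost and benefit of autowarm. If the parameter is not given, no autowarm is done.<br />
	In terms of implementation, LRUCache stores its data directly in a LinkedHashMap. size bounds the cache, while initialSize is only the map's initial capacity. Eviction uses LinkedHashMap's built-in LRU ordering. Reads and writes both take one global lock on the map, so concurrency is somewhat poor.</p>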
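<p>The LinkedHashMap approach can be illustrated with a minimal sketch. This is not Solr's LRUCache source; the class name SimpleLruCache is hypothetical, and I assume that the size parameter acts as the eviction limit while initialSize is only the initial capacity.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SimpleLruCache<K, V> {
    private final Map<K, V> map;

    public SimpleLruCache(final int limit, int initialSize) {
        // accessOrder=true keeps entries ordered from least to most recently accessed.
        this.map = new LinkedHashMap<K, V>(Math.min(initialSize, limit), 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after every put: drop the least recently used entry when over the limit.
                return size() > limit;
            }
        };
    }

    // With accessOrder=true even get() reorders the map, so reads must take the lock too.
    public synchronized V get(K key) { return map.get(key); }

    public synchronized V put(K key, V value) { return map.put(key, value); }

    public synchronized int size() { return map.size(); }
}
```

<p>Because every get also mutates the access order, readers cannot run in parallel here, which is where the weaker concurrency of this design comes from.</p>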
<h3>1.2 solr.search.FastLRUCache</h3>
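<p>For configuration, FastLRUCache takes the LRUCache parameters and optionally the following:<br />
1) minSize: when the cache reaches its maximum size, eviction brings it down to minSize; the default is 0.9*size.<br />
2) acceptableSize: eviction aims for minSize, but when that cannot be reached it settles for acceptableSize; the default is 0.95*size.<br />
3) cleanupThread: LRUCache evicts synchronously inside put, whereas FastLRUCache can hand eviction to a separate thread when cleanupThread is configured. With a very large cache, a single eviction pass can take a long time, which is not acceptable on a thread that is serving query requests, so a dedicated background thread is worthwhile.<br />
	In terms of implementation, FastLRUCache stores its data in a ConcurrentLRUCache, which is a ConcurrentHashMap with an LRU eviction policy added, so its concurrency is much better. This is also a very typical design for caches written in Java.</p>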
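<p>The idea of "a ConcurrentHashMap plus LRU eviction down to minSize" can be sketched as follows. This is a deliberately simplified illustration, not Solr's ConcurrentLRUCache algorithm: the class name WatermarkLruCache is hypothetical, acceptableSize and the cleanup thread are left out, and eviction simply sorts all entries by a logical access stamp.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class WatermarkLruCache<K, V> {
    /** A cached value plus the logical time of its last access. */
    private static final class Slot<V> {
        final V value;
        volatile long lastAccessed;
        Slot(V value, long stamp) { this.value = value; this.lastAccessed = stamp; }
    }

    private final ConcurrentHashMap<K, Slot<V>> map = new ConcurrentHashMap<>();
    private final AtomicLong clock = new AtomicLong();
    private final int size;     // exceeding this triggers eviction
    private final int minSize;  // eviction shrinks the cache down to this

    public WatermarkLruCache(int size, int minSize) {
        this.size = size;
        this.minSize = minSize;
    }

    /** Lock-free read: only the access stamp of the slot is updated. */
    public V get(K key) {
        Slot<V> slot = map.get(key);
        if (slot == null) return null;
        slot.lastAccessed = clock.incrementAndGet();
        return slot.value;
    }

    public void put(K key, V value) {
        map.put(key, new Slot<>(value, clock.incrementAndGet()));
        if (map.size() > size) evict();
    }

    public int size() { return map.size(); }

    /** Removes the least recently accessed entries until minSize is reached. */
    private synchronized void evict() {
        List<Map.Entry<K, Slot<V>>> entries = new ArrayList<>(map.entrySet());
        entries.sort((x, y) -> Long.compare(x.getValue().lastAccessed, y.getValue().lastAccessed));
        for (int i = 0; i < entries.size() && map.size() > minSize; i++) {
            map.remove(entries.get(i).getKey());
        }
    }
}
```

<p>Evicting in batches down to minSize, instead of one item per put, is what lets most put calls skip the expensive pass; in this sketch the pass still runs on the calling thread, which is the cost a separate cleanup thread would take over.</p>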
<h2>2. filterCache</h2>
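<p>	filterCache stores unordered sets of lucene document ids and has three uses:<br />
	1) filterCache stores the document id sets produced by filter queries (the &#8220;fq&#8221; parameter). Solr has two kinds of query parameter, q and fq. When fq is present, Solr evaluates the fq first (there can be several fq parameters, and their results are intersected) and then intersects the fq result with the q result. Here filterCache is a cache whose key is a single fq (of type Query) and whose value is a document id set (of type DocSet). filterCache is especially valuable when the fq is a range query.<br />
	2) filterCache is also used by facet queries (http://wiki.apache.org/solr/SolrFacetingOverview). The count of each facet is computed from the set of document ids that match the query (which can involve filterCache). Because counting facets may touch every doc id, filterCache needs to be large enough to hold the number of documents in the index.<br />
	3) If &lt;useFilterForSortedQuery/&gt; is configured in solrconfig.xml, filterCache is used when the query has a filter (this filter is a DocSet to filter by, not an fq; I have not found a use for it).<br />
The following is a sample filterCache configuration:</p>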

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">    <span style="color: #808080; font-style: italic;">&lt;!-- Internal cache used by SolrIndexSearcher for filters (DocSets),</span>
<span style="color: #808080; font-style: italic;">         unordered sets of *all* documents that match a query.</span>
<span style="color: #808080; font-style: italic;">         When a new searcher is opened, its caches may be prepopulated</span>
<span style="color: #808080; font-style: italic;">         or &quot;autowarmed&quot; using data from caches in the old searcher.</span>
<span style="color: #808080; font-style: italic;">         autowarmCount is the number of items to prepopulate.  For LRUCache,</span>
<span style="color: #808080; font-style: italic;">         the prepopulated items will be the most recently accessed items.</span>
<span style="color: #808080; font-style: italic;">      --&gt;</span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filterCache</span></span>
<span style="color: #009900;">      <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;solr.LRUCache&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">size</span>=<span style="color: #ff0000;">&quot;16384&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">initialSize</span>=<span style="color: #ff0000;">&quot;4096&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">autowarmCount</span>=<span style="color: #ff0000;">&quot;4096&quot;</span><span style="color: #000000; font-weight: bold;">/&gt;</span></span></pre></div></div>

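<p>      Whether to use filterCache and how large to make it has to be judged from the application's characteristics, statistics, measured results and experience. For applications that use fq or facets, tuning filterCache is well worth it.</p>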
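<p>      To make the first use concrete, here is a small sketch of per-fq caching with intersection. Everything in it is illustrative: FilterCacheSketch is a hypothetical class, queries are plain strings, document id sets are java.util.Set instances rather than DocSet, and the index is modelled as a function from a query string to its matching ids.</p>

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

public class FilterCacheSketch {
    // Key: one fq; value: the ids of all documents matching that fq.
    private final Map<String, Set<Integer>> filterCache = new HashMap<>();
    // Hypothetical index: evaluates a query string to the set of matching document ids.
    private final Function<String, Set<Integer>> index;
    private int filterLookups = 0;

    public FilterCacheSketch(Function<String, Set<Integer>> index) {
        this.index = index;
    }

    /** Returns the cached id set for one fq, evaluating it against the index on a miss. */
    private Set<Integer> docSetFor(String fq) {
        Set<Integer> cached = filterCache.get(fq);
        if (cached == null) {
            filterLookups++;
            cached = index.apply(fq);
            filterCache.put(fq, cached);
        }
        return cached;
    }

    /** Intersects the result of q with the id set of every fq. */
    public Set<Integer> search(String q, List<String> fqs) {
        Set<Integer> result = new TreeSet<>(index.apply(q));
        for (String fq : fqs) {
            result.retainAll(docSetFor(fq));
        }
        return result;
    }

    /** Number of times an fq had to be evaluated against the index. */
    public int filterLookups() { return filterLookups; }
}
```

<p>      Because each fq is cached on its own, a filter that is shared by many different q values is evaluated against the index only once per searcher.</p>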
<h2>3. queryResultCache</h2>
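<p>	As the name suggests, queryResultCache caches query results (the caches in SolrIndexSearcher all cache document id sets). The cached result is the fully ordered result for the query conditions. The following is a sample configuration:</p>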

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">	    <span style="color: #808080; font-style: italic;">&lt;!-- queryResultCache caches results of searches - ordered lists of</span>
<span style="color: #808080; font-style: italic;">         document ids (DocList) based on a query, a sort, and the range</span>
<span style="color: #808080; font-style: italic;">         of documents requested.</span>
<span style="color: #808080; font-style: italic;">      --&gt;</span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;queryResultCache</span></span>
<span style="color: #009900;">      <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;solr.LRUCache&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">size</span>=<span style="color: #ff0000;">&quot;16384&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">initialSize</span>=<span style="color: #ff0000;">&quot;4096&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">autowarmCount</span>=<span style="color: #ff0000;">&quot;1024&quot;</span><span style="color: #000000; font-weight: bold;">/&gt;</span></span></pre></div></div>

<p>      What does the cache key look like? It is the class below (the key's hashCode is the hc field of QueryResultKey):</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> QueryResultKey<span style="color: #009900;">&#40;</span>Query query, List<span style="color: #339933;">&lt;</span>Query<span style="color: #339933;">&gt;</span> filters, Sort sort, <span style="color: #000066; font-weight: bold;">int</span> nc_flags<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">query</span> <span style="color: #339933;">=</span> query<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">sort</span> <span style="color: #339933;">=</span> sort<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">filters</span> <span style="color: #339933;">=</span> filters<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">nc_flags</span> <span style="color: #339933;">=</span> nc_flags<span style="color: #339933;">;</span>
    <span style="color: #000066; font-weight: bold;">int</span> h <span style="color: #339933;">=</span> query.<span style="color: #006633;">hashCode</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>filters <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> h <span style="color: #339933;">^=</span> filters.<span style="color: #006633;">hashCode</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    sfields <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">sort</span> <span style="color: #339933;">!=</span><span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">?</span> <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">sort</span>.<span style="color: #006633;">getSort</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">:</span> defaultSort<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span>SortField sf <span style="color: #339933;">:</span> sfields<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #666666; font-style: italic;">// mix the bits so that sortFields are position dependent</span>
      <span style="color: #666666; font-style: italic;">// so that a,b won't hash to the same value as b,a</span>
      h <span style="color: #339933;">^=</span> <span style="color: #009900;">&#40;</span>h <span style="color: #339933;">&lt;&lt;</span> <span style="color: #cc66cc;">8</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">|</span> <span style="color: #009900;">&#40;</span>h <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">25</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>   <span style="color: #666666; font-style: italic;">// reversible hash</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>sf.<span style="color: #006633;">getField</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">!=</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> h <span style="color: #339933;">+=</span> sf.<span style="color: #006633;">getField</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">hashCode</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      h <span style="color: #339933;">+=</span> sf.<span style="color: #006633;">getType</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>sf.<span style="color: #006633;">getReverse</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> h<span style="color: #339933;">=</span>~h<span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>sf.<span style="color: #006633;">getLocale</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">!=</span><span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> h<span style="color: #339933;">+=</span>sf.<span style="color: #006633;">getLocale</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">hashCode</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>sf.<span style="color: #006633;">getFactory</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">!=</span><span style="color: #000066; font-weight: bold;">null</span><span style="color: #009900;">&#41;</span> h<span style="color: #339933;">+=</span>sf.<span style="color: #006633;">getFactory</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">hashCode</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
    hc <span style="color: #339933;">=</span> h<span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span></pre></div></div>

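<p>  Query parameters include start and rows, so a QueryResultKey may hit the cache while the requested start and rows fall outside the cached range of document ids. The larger the cached document id set, the more likely a hit, but that also wastes memory. The queryResultWindowSize parameter sets the size of the cached document id set. Solr's default is 50 and it is configurable. The explanation on the wiki is simple and clear:</p>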

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">      <span style="color: #808080; font-style: italic;">&lt;!-- An optimization for use with the queryResultCache.  When a search</span>
<span style="color: #808080; font-style: italic;">         is requested, a superset of the requested number of document ids</span>
<span style="color: #808080; font-style: italic;">         are collected.  For example, if a search for a particular query</span>
<span style="color: #808080; font-style: italic;">         requests matching documents 10 through 19, and queryResultWindowSize is 50,</span>
<span style="color: #808080; font-style: italic;">         then documents 0 through 49 will be collected and cached.  Any further</span>
<span style="color: #808080; font-style: italic;">         requests in that range can be satisfied via the cache.</span>
<span style="color: #808080; font-style: italic;">    --&gt;</span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;queryResultWindowSize<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>50<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/queryResultWindowSize<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>
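<p>  The window arithmetic can be written down as a small sketch. The class and method names are hypothetical, and rounding the requested range up to the next multiple of the window size is my assumption about how the superset is chosen; it agrees with the 10 through 19 example in the comment above.</p>

```java
public class QueryResultWindow {
    /**
     * Number of leading documents to collect and cache for a request that asks
     * for rows documents starting at offset start (0-based).
     */
    public static int supersetSize(int start, int rows, int windowSize) {
        int maxDocRequested = start + rows;
        if (maxDocRequested <= windowSize) {
            return windowSize;
        }
        // Round up to the next multiple of windowSize.
        return ((maxDocRequested - 1) / windowSize + 1) * windowSize;
    }

    /** True if a cached list of cachedSize leading documents covers the request. */
    public static boolean servedFromCache(int start, int rows, int cachedSize) {
        return start + rows <= cachedSize;
    }

    public static void main(String[] args) {
        // start=10, rows=10 asks for documents 10 through 19.
        System.out.println(supersetSize(10, 10, 50));
    }
}
```

<p>  With a window of 50, paging through the first five pages of ten results touches the index once, and the remaining four requests fall inside the cached range.</p>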

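<p>  Compared with filterCache, queryResultCache uses less memory, but how much it helps is hard to say. With index data, we usually store only the application's primary key id in the index and fetch the other fields from a database or another data source. The query then becomes: get the document id set from Solr, have Solr map it to the set of application ids, and finally fetch the complete query result from the external data source. If strict correctness of query results is not required, the complete results can be cached independently outside Solr (with timed expiry), in which case queryResultCache is not really needed; otherwise consider using it. And if you find that queries overlap very little during the lifetime of a queryResultCache, there is little point in enabling it either.</p>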
<h2>4. documentCache</h2>
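<p>	Again as the name suggests, documentCache stores &lt;doc_id,document&gt; pairs. If you use documentCache, make it as large as practical, at least larger than &lt;max_results&gt; * &lt;max_concurrent_queries&gt;; otherwise, because of eviction, a document may have to be fetched again within a single request. Also watch how many stored fields the documents carry, to avoid heavy memory use.<br />
	The following is a sample documentCache configuration:</p>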

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">	<span style="color: #808080; font-style: italic;">&lt;!-- documentCache caches Lucene Document objects (the stored fields for each document).</span>
<span style="color: #808080; font-style: italic;">      --&gt;</span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;documentCache</span></span>
<span style="color: #009900;">      <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;solr.LRUCache&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">size</span>=<span style="color: #ff0000;">&quot;16384&quot;</span></span>
<span style="color: #009900;">      <span style="color: #000066;">initialSize</span>=<span style="color: #ff0000;">&quot;16384&quot;</span><span style="color: #000000; font-weight: bold;">/&gt;</span></span></pre></div></div>

<h2>5. User/Generic Caches</h2>
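<p>Solr supports user-defined caches; if autowarming is wanted, you only need to supply your own regenerator. The following is a sample configuration:</p>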

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">    <span style="color: #808080; font-style: italic;">&lt;!-- Example of a generic cache.  These caches may be accessed by name</span>
<span style="color: #808080; font-style: italic;">         through SolrIndexSearcher.getCache(),cacheLookup(), and cacheInsert().</span>
<span style="color: #808080; font-style: italic;">         The purpose is to enable easy caching of user/application level data.</span>
<span style="color: #808080; font-style: italic;">         The regenerator argument should be specified as an implementation</span>
<span style="color: #808080; font-style: italic;">         of solr.search.CacheRegenerator if autowarming is desired.</span>
<span style="color: #808080; font-style: italic;">    --&gt;</span>
    <span style="color: #808080; font-style: italic;">&lt;!--</span>
<span style="color: #808080; font-style: italic;">    &lt;cache name=&quot;yourCacheNameHere&quot;</span>
<span style="color: #808080; font-style: italic;">      class=&quot;solr.LRUCache&quot;</span>
<span style="color: #808080; font-style: italic;">      size=&quot;4096&quot;</span>
<span style="color: #808080; font-style: italic;">      initialSize=&quot;2048&quot;</span>
<span style="color: #808080; font-style: italic;">      autowarmCount=&quot;4096&quot;</span>
<span style="color: #808080; font-style: italic;">      regenerator=&quot;org.foo.bar.YourRegenerator&quot;/&gt;</span>
<span style="color: #808080; font-style: italic;">    --&gt;</span></pre></div></div>

<h2>6. The Lucene FieldCache</h2>
<p>	lucene has a lower-level FieldCache that Solr does not manage, so lucene's FieldCache is still handled by lucene's IndexSearcher.</p>
<h2>7. autowarm</h2>
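<p>	Autowarm was mentioned above. It is triggered at two points: when the first Searcher is created (firstSearcher), and when a new Searcher is created (newSearcher) to replace the current one. Before a Searcher starts serving requests, each of its caches can be warmed. This normally happens in the SolrCache warm method, and the warm strategy differs from cache to cache.<br />
	1) filterCache: filterCache registers the CacheRegenerator below, which runs the old key against the index and puts the new value into the new cache.</p>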

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	      solrConfig.<span style="color: #006633;">filterCacheConfig</span>.<span style="color: #006633;">setRegenerator</span><span style="color: #009900;">&#40;</span>
              <span style="color: #000000; font-weight: bold;">new</span> CacheRegenerator<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">boolean</span> regenerateItem<span style="color: #009900;">&#40;</span>SolrIndexSearcher newSearcher, SolrCache newCache, SolrCache oldCache, <span style="color: #003399;">Object</span> oldKey, <span style="color: #003399;">Object</span> oldVal<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">IOException</span> <span style="color: #009900;">&#123;</span>
                  newSearcher.<span style="color: #006633;">cacheDocSet</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>Query<span style="color: #009900;">&#41;</span>oldKey, <span style="color: #000066; font-weight: bold;">null</span>, <span style="color: #000066; font-weight: bold;">false</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
                  <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">true</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
              <span style="color: #009900;">&#125;</span>
      <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

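<p>      2) queryResultCache: as I read it, the autowarm of queryResultCache does not happen in the SolrCache warm method (that is, not by walking the query keys of the existing queryResultCache and re-running them). Instead, the void newSearcher(SolrIndexSearcher newSearcher, SolrIndexSearcher currentSearcher) method of the SolrEventListener interface runs specific queries from the configuration, which explicitly warms the lucene FieldCache.<br />
      The following is a sample listener configuration:</p>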

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;listener</span> <span style="color: #000066;">event</span>=<span style="color: #ff0000;">&quot;newSearcher&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;solr.QuerySenderListener&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;arr</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;queries&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #808080; font-style: italic;">&lt;!-- seed common sort fields --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;q&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>anything<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;sort&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>name desc, price desc, popularity desc<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/arr<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/listener<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;listener</span> <span style="color: #000066;">event</span>=<span style="color: #ff0000;">&quot;firstSearcher&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;solr.QuerySenderListener&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;arr</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;queries&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #808080; font-style: italic;">&lt;!-- seed common sort fields --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;q&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>anything<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;sort&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>name desc, price desc, popularity desc<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #808080; font-style: italic;">&lt;!-- seed common facets and filter queries --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;q&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>anything<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> 
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;facet.field&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>category<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> 
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;fq&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>inStock:true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;fq&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>price:[0 TO 100]<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/arr<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/listener<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

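<p>    3) documentCache: in a new index the mapping between internal document ids and the indexed documents changes, so documentCache has no warming step at all; the new searcher simply starts from a clean slate.<br />
    Autowarming is useful, but it has a cost. Measure how long warming actually takes in your deployment, and watch how often searchers are swapped, so that warming and swapping do not get in the way of the searcher's normal query service.</p>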
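<p>The trade-off above maps onto a handful of settings in the &lt;query&gt; section of solrconfig.xml. The fragment below is a minimal sketch: the element and attribute names are the ones used in the stock Solr 1.4 example configuration, while the sizes and autowarmCount values are illustrative assumptions that should be tuned against measured warming time.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">&lt;!-- larger autowarmCount values mean warmer caches but slower searcher swaps --&gt;
&lt;filterCache class=&quot;solr.FastLRUCache&quot; size=&quot;512&quot; initialSize=&quot;512&quot; autowarmCount=&quot;128&quot;/&gt;
&lt;queryResultCache class=&quot;solr.LRUCache&quot; size=&quot;512&quot; initialSize=&quot;512&quot; autowarmCount=&quot;32&quot;/&gt;
&lt;!-- documentCache cannot be autowarmed, so autowarmCount stays 0 --&gt;
&lt;documentCache class=&quot;solr.LRUCache&quot; size=&quot;512&quot; initialSize=&quot;512&quot; autowarmCount=&quot;0&quot;/&gt;
&lt;!-- let requests wait for a warmed searcher instead of using a cold one --&gt;
&lt;useColdSearcher&gt;false&lt;/useColdSearcher&gt;
&lt;!-- limit how many searchers may be warming at the same time --&gt;
&lt;maxWarmingSearchers&gt;2&lt;/maxWarmingSearchers&gt;</pre></div></div>

<p>If commits arrive faster than warming finishes, the maxWarmingSearchers limit is reached, which is a sign that either the commit frequency or the autowarm counts should come down.</p>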
<h2>8. References</h2>
<p>http://wiki.apache.org/solr/SolrCaching</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/08/267.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hot deployment for Java web development with JRebel</title>
		<link>http://www.kafka0102.com/2010/07/258.html</link>
		<comments>http://www.kafka0102.com/2010/07/258.html#comments</comments>
		<pubDate>Thu, 29 Jul 2010 13:32:08 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[jrebel]]></category>
		<category><![CDATA[hot deployment]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=258</guid>
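		<description><![CDATA[I have been writing Java web pages for the past few days, in a linux+eclipse+maven+jetty development environment. The most annoying part of Java web development is having to restart the web server after changing a file. Today's web servers (Tomcat, for example) do support hot deployment, but their implementation amounts to restarting the web server, and when there are many files or initialization is complex, the restart takes painfully long. As for IDEs, MyEclipse is a decent choice: it automatically deploys modified files to the web server (Eclipse WTP does not support this, although you can work around it by symlinking the deployment directory to the development directory). However, when I tried its latest version, 8.5, it was sluggish on my laptop, which spoiled the mood for coding. Besides, because of environment constraints, I eventually settled on eclipse+maven+jetty (Maven provides a Jetty plugin for development and testing) and found JRebel, a powerful tool that brings hot deployment to the web server. Unlike the web server's own mechanism, it does not restart the service but dynamically reloads the modified files, so it responds much faster. Besides class and JSP files, it can also hot-reload configuration files such as those of Spring and Hibernate.]]></description>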
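			<content:encoded><![CDATA[<p>I have been writing Java web pages for the past few days, in a linux+eclipse+maven+jetty development environment. The most annoying part of Java web development is having to restart the web server after changing a file. Today's web servers (Tomcat, for example) do support hot deployment, but their implementation amounts to restarting the web server, and when there are many files or initialization is complex, the restart takes painfully long. As for IDEs, MyEclipse is a decent choice: it automatically deploys modified files to the web server (Eclipse WTP does not support this, although you can work around it by symlinking the deployment directory to the development directory). However, when I tried its latest version, 8.5, it was sluggish on my laptop, which spoiled the mood for coding. Besides, because of environment constraints, I eventually settled on eclipse+maven+jetty (Maven provides a Jetty plugin for development and testing) and found JRebel, a powerful tool that brings hot deployment to the web server. Unlike the web server's own mechanism, it does not restart the service but dynamically reloads the modified files, so it responds much faster. Besides class and JSP files, it can also hot-reload configuration files such as those of Spring and Hibernate.</p>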
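<p>JRebel is available from <a href="http://www.zeroturnaround.com/jrebel/" target="blank">jrebel</a>. It is not free software, but it comes with a 30-day trial, so I will get by on that for a month first. If you use Eclipse WTP, you can follow the <a href="http://www.zeroturnaround.com/jrebel/eclipse-jrebel-tutorial/" target="blank">eclipse-jrebel-tutorial</a> to install and use it. Below are the installation steps for a linux+eclipse+maven+jetty environment (I could not find a complete set of steps on the JRebel website):</p>
<p>1. Download the latest JRebel package from <a href="http://www.zeroturnaround.com/jrebel/current/" target="blank">http://www.zeroturnaround.com/jrebel/current/</a> and install it following the instructions in readme.txt. My installation path is /home/kafka/tools/jrebel.</p>
<p>2. Edit the project's pom.xml and change the existing jetty plugin configuration to:</p>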

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>org.mortbay.jetty<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>maven-jetty-plugin<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
					<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;scanIntervalSeconds<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>0<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/scanIntervalSeconds<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>Then add the following inside &lt;build&gt;&lt;plugins&gt;:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">	 <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>org.zeroturnaround<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>javarebel-maven-plugin<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;executions<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;execution<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			 <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;id<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>generate-rebel-xml<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/id<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;phase<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>process-resources<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/phase<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;goals<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		         	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;goal<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>generate<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/goal<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/goals<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/execution<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/executions<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>Then run mvn javarebel:generate. The newly generated rebel.xml appears under target/classes. See http://www.zeroturnaround.com/jrebel/configuration/maven/ for more details.</p>
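<p>For orientation, a generated rebel.xml usually has roughly the shape below. This is only a sketch of the general format: the project directory /home/kafka/workspace/myapp is a hypothetical path, and the file produced by your plugin version may differ in detail (for example, the root element normally carries XML namespace attributes, omitted here).</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">&lt;application&gt;
	&lt;classpath&gt;
		&lt;!-- compiled classes that JRebel watches and reloads (hypothetical path) --&gt;
		&lt;dir name=&quot;/home/kafka/workspace/myapp/target/classes&quot;/&gt;
	&lt;/classpath&gt;
	&lt;web&gt;
		&lt;!-- web resources served from the source tree (hypothetical path) --&gt;
		&lt;link target=&quot;/&quot;&gt;
			&lt;dir name=&quot;/home/kafka/workspace/myapp/src/main/webapp&quot;/&gt;
		&lt;/link&gt;
	&lt;/web&gt;
&lt;/application&gt;</pre></div></div>

<p>The point of the file is to tell JRebel which directories on disk back the running application, so that edits made in the workspace are picked up without repackaging.</p>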
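<p>3. Set the MAVEN_OPTS environment variable. This is my personal machine, so I configure it directly in .bashrc: export MAVEN_OPTS=&quot;-noverify -javaagent:/home/kafka/tools/jrebel/jrebel.jar&quot;. The javaagent option points to jrebel.jar under the JRebel installation path from step 1.</p>
<p>4. That is all. Run mvn jetty:run, and the startup output shows JRebel-related messages. Later file changes also show up on the console as JRebel reloads the affected files.</p>
<p>Browsing the JRebel website, I found several excellent articles, starting at http://www.zeroturnaround.com/blog/reloading-objects-classes-classloaders/ , on how JRebel works, on classloaders, and on how web servers implement hot deployment. I have also been reading about Terracotta DSO lately; both it and JRebel rely on Java instrumentation, which I plan to study properly when I find the time.</p>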
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/07/258.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>[Solr source analysis] A brief analysis of the Solr replication class ReplicationHandler</title>
		<link>http://www.kafka0102.com/2010/07/249.html</link>
		<comments>http://www.kafka0102.com/2010/07/249.html#comments</comments>
		<pubDate>Sat, 24 Jul 2010 16:42:27 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[solr]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[source code analysis]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=249</guid>
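		<description><![CDATA[Building on the previous post, "An introduction to using Solr's ReplicationHandler", this post analyzes some implementation details of Solr's ReplicationHandler. As a rule the analysis does not quote large chunks of code: I felt that quoting code would not necessarily explain things better, yet having left it out I find the explanation is still not great. In the end, the fault lies in my inability to explain deep things simply. Enough preamble; on to the content.]]></description>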
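			<content:encoded><![CDATA[<p>Building on the previous post, "<a title="Permanent Link to An introduction to using Solr's ReplicationHandler" rel="bookmark" href="http://www.kafka0102.com/2010/07/244.html">An introduction to using Solr's ReplicationHandler</a>", this post analyzes some implementation details of Solr's ReplicationHandler. As a rule the analysis does not quote large chunks of code: I felt that quoting code would not necessarily explain things better, yet having left it out I find the explanation is still not great. In the end, the fault lies in my inability to explain deep things simply. Enough preamble; on to the content.</p>
<h2>1. What the master does</h2>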
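<p>For ReplicationHandler's replication, the core problem is determining which files must be copied at a given point in time, and this is where Lucene's IndexDeletionPolicy comes in. When Lucene initializes, it calls IndexDeletionPolicy.onInit(List&lt;? extends IndexCommit&gt; commits); when Lucene commits (which can also be triggered by optimize or close; a Solr commit actually closes the IndexWriter), it calls IndexDeletionPolicy.onCommit(List&lt;? extends IndexCommit&gt; commits). An IndexCommit object holds, among other things, the list of files associated with that commit. During replication this lets the slave fetch the file list from the master, compare it with its local files, skip unchanged files, download new ones, and delete those no longer needed. The two commit callbacks of IndexDeletionPolicy operate on the list of currently existing commits. For example, Lucene's default KeepOnlyLastCommitDeletionPolicy keeps only the latest IndexCommit and calls delete on the outdated ones so that files no longer needed are removed. In Solr, SolrDeletionPolicy also keeps only the latest IndexCommit by default, but maxCommitAge, maxCommitsToKeep and maxOptimizedCommitsToKeep can be set to retain more. The IndexDeletionPolicy implementation that Solr actually uses, however, is IndexDeletionPolicyWrapper, which wraps SolrDeletionPolicy. While a slave is copying files from the master, the IndexCommit point being copied must not be deleted. That is the job of void setReserveDuration(Long indexVersion, long reserveTime) in IndexDeletionPolicyWrapper: the master calls it before responding to the slave's indexversion and filelist commands, and again after every 5 MB of index file content sent to the slave. The default reserveTime is 10 seconds, so if transferring 5 MB takes more than 10 seconds on a slow network, this value needs to be raised.</p>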
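<p>The retention settings mentioned above are set on the deletion policy in solrconfig.xml. A minimal sketch, assuming the layout of the Solr 1.4 example configuration (where this element sits inside &lt;mainIndex&gt;); the values are illustrative, not recommendations:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">&lt;deletionPolicy class=&quot;solr.SolrDeletionPolicy&quot;&gt;
    &lt;!-- keep the two most recent commit points instead of only the latest --&gt;
    &lt;str name=&quot;maxCommitsToKeep&quot;&gt;2&lt;/str&gt;
    &lt;!-- do not additionally retain optimized commit points --&gt;
    &lt;str name=&quot;maxOptimizedCommitsToKeep&quot;&gt;0&lt;/str&gt;
    &lt;!-- drop commit points once they reach this age --&gt;
    &lt;str name=&quot;maxCommitAge&quot;&gt;1DAY&lt;/str&gt;
&lt;/deletionPolicy&gt;</pre></div></div>

<p>Retaining more commit points costs disk space on the master, because the files of every retained commit stay on disk until that commit is deleted.</p>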
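<p>ReplicationHandler does not use rsync to copy files; it uses HTTP. When it reads a file and streams it to the slave, it writes the content in packets of 1 MB by default (HTTP chunked?), and by default it checksums each packet to verify that the transferred content is correct. The setReserveDuration call mentioned above happens here: the reservation is renewed whenever packetsWritten % 5 == 0.</p>
<p>ReplicationHandler can also back up the index. Because Lucene only ever adds new index files and never modifies existing ones, backing up a single IndexCommit point is a simple process.</p>
<h2>2. What the slave does</h2>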
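<p>On startup the slave creates a SnapPuller object, which starts a thread that periodically (every pollInterval) copies data from the master (the fetchLatestIndex method). The slave-master interaction in one replication cycle works as follows:<br />
1. The slave first asks the master for the latest index version (the indexversion command). After checking that the returned latestVersion and latestGeneration are valid, it compares them with getVersion() and getGeneration() of its local IndexCommit. If they differ, it proceeds; otherwise it waits for the next scheduled poll.</p>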
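<p>2. The slave asks the master for the file list of the indexversion obtained above (the filelist command, covering index files and, optionally, configuration files). If the list is empty, it returns and waits for the next poll. Otherwise it works out which files must be downloaded, using two checks: 1) if the local commit.getGeneration() &gt;= latestGeneration, the local index is considered corrupted (for example, someone accidentally sent an index-modifying command to the slave), and all files are copied from the master in full; 2) each file in the list is checked for local existence, and the missing ones are downloaded.</p>
<p>3. File contents are downloaded with the filecontent command. Downloaded files naturally go into a temporary directory, which lives in the same data directory as the existing index directory (named index by default) but is named index.&lt;timestamp&gt;. Once the download completes there are two cases: 1) After a full download, the files are not copied into the existing directory. Instead, index.properties in the data directory is updated to mark the newly created temporary directory as the index directory. The old index directory is not deleted and can be removed by hand. Of course, a slave whose generation is ahead of the master's is an anomaly that should not normally occur. 2) In the usual case, the files in the temporary index directory are copied into the old index directory, with segments_N copied last so that a failure midway does not corrupt the index.</p>
<p>4. Once the new index files and any configuration files have been copied, the slave issues a commit on the SolrCore's UpdateHandler, which closes the IndexWriter and forces a new IndexSearcher to open and take over. In addition, if the SolrCore's UpdateHandler is a DirectUpdateHandler2 (it should always be), handler.forceOpenWriter() is called to delete old, unneeded index files, and replicationHandler.refreshCommitpoint() is called to update the slave's indexCommitPoint.</p>
<p>5. If index replication fails, the slave records the failure in replication.properties under the data directory.</p>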
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/07/249.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>[Solr in practice] Using the Solr replication class ReplicationHandler</title>
		<link>http://www.kafka0102.com/2010/07/244.html</link>
		<comments>http://www.kafka0102.com/2010/07/244.html#comments</comments>
		<pubDate>Sat, 24 Jul 2010 14:30:51 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[solr]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[ReplicationHandler]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=244</guid>
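		<description><![CDATA[Solr 1.4 introduced ReplicationHandler to replace the external scripts for replicating index data, making index replication more automated. For users, a little simple configuration is enough to enjoy Solr replication from then on. Its usage is described below.]]></description>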
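			<content:encoded><![CDATA[<p>Solr 1.4 introduced ReplicationHandler to replace the external scripts for replicating index data, making index replication more automated. For users, a little simple configuration is enough to enjoy Solr replication from then on. Its usage is described below.</p>
<h2>1. Configuration</h2>
<p>ReplicationHandler is a RequestHandler, so using it means configuring it in solrconfig.xml. Its configuration parameters are described below.</p>
<h3>1.1 Master</h3>
<p>A sample master configuration:</p>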

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;requestHandler</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;/replication&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;solr.ReplicationHandler&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;</span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;master&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #808080; font-style: italic;">&lt;!--Replicate on 'startup' and 'commit'. 'optimize' is also a valid value for replicateAfter. --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;replicateAfter&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>startup<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;replicateAfter&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>commit<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
        <span style="color: #808080; font-style: italic;">&lt;!--Create a backup after 'optimize'. Other values can be 'commit', 'startup'. It is possible to have multiple entries of this config string.  Note that this is just for backup, replication does not require this. --&gt;</span>
        <span style="color: #808080; font-style: italic;">&lt;!-- &lt;str name=&quot;backupAfter&quot;&gt;optimize&lt;/str&gt; --&gt;</span>
&nbsp;
        <span style="color: #808080; font-style: italic;">&lt;!--If configuration files need to be replicated give the names here, separated by comma --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;confFiles&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>schema.xml,stopwords.txt,elevate.xml<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #808080; font-style: italic;">&lt;!--The default value of reservation is 10 secs. See the documentation below. Normally, you should not need to specify this --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;commitReserveDuration&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>00:00:10<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/requestHandler<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

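<p>Notes:<br />
	1) replicateAfter accepts startup, commit and optimize, and specifies when replication is triggered. In practice all three can be configured.<br />
	2) backupAfter specifies when to back up; if backups are needed, Solr creates them automatically at the configured events.<br />
	3) confFiles lists the files to copy to the slave during replication.<br />
	4) commitReserveDuration defaults to 10 seconds. You normally do not need to change it unless your network is so slow that transferring 5 MB of data takes more than 10 seconds.</p>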
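<p>Putting these notes together, a master behind a slow link might be configured along the following lines. This is a sketch only: it uses the same parameters as the sample above, and the 30-second reservation is an illustrative assumption rather than a measured value.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">&lt;requestHandler name=&quot;/replication&quot; class=&quot;solr.ReplicationHandler&quot;&gt;
    &lt;lst name=&quot;master&quot;&gt;
        &lt;!-- trigger replication on all three events --&gt;
        &lt;str name=&quot;replicateAfter&quot;&gt;startup&lt;/str&gt;
        &lt;str name=&quot;replicateAfter&quot;&gt;commit&lt;/str&gt;
        &lt;str name=&quot;replicateAfter&quot;&gt;optimize&lt;/str&gt;
        &lt;!-- take a backup after every optimize --&gt;
        &lt;str name=&quot;backupAfter&quot;&gt;optimize&lt;/str&gt;
        &lt;str name=&quot;confFiles&quot;&gt;schema.xml,stopwords.txt&lt;/str&gt;
        &lt;!-- reserve a commit point longer than the default 10 seconds --&gt;
        &lt;str name=&quot;commitReserveDuration&quot;&gt;00:00:30&lt;/str&gt;
    &lt;/lst&gt;
&lt;/requestHandler&gt;</pre></div></div>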
<h3>1.2 Slave</h3>
<p>A sample slave configuration:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;requestHandler</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;/replication&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;solr.ReplicationHandler&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;</span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lst</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;slave&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
&nbsp;
        <span style="color: #808080; font-style: italic;">&lt;!--Fully qualified URL for the replication handler of the master. It can also be passed as a request param to the fetchindex command--&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;masterUrl&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>http://master_host:port/solr/corename/replication<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>  
&nbsp;
        <span style="color: #808080; font-style: italic;">&lt;!--Interval at which the slave should poll the master. Format is HH:mm:ss. If this is absent, the slave does not poll automatically. </span>
<span style="color: #808080; font-style: italic;">         But a fetchindex can be triggered from the admin or the http API --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;pollInterval&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>00:00:20<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>  
        <span style="color: #808080; font-style: italic;">&lt;!-- THE FOLLOWING PARAMETERS ARE USUALLY NOT REQUIRED--&gt;</span>
        <span style="color: #808080; font-style: italic;">&lt;!--to use compression while transferring the index files. The possible values are internal|external</span>
<span style="color: #808080; font-style: italic;">         if the value is 'external' make sure that your master Solr has the settings to honour the accept-encoding header.</span>
<span style="color: #808080; font-style: italic;">         see here for details http://wiki.apache.org/solr/SolrHttpCompression</span>
<span style="color: #808080; font-style: italic;">         If it is 'internal' everything will be taken care of automatically. </span>
<span style="color: #808080; font-style: italic;">         USE THIS ONLY IF YOUR BANDWIDTH IS LOW. THIS CAN ACTUALLY SLOW DOWN REPLICATION IN A LAN--&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;compression&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>internal<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #808080; font-style: italic;">&lt;!--The following values are used when the slave connects to the master to download the index files. </span>
<span style="color: #808080; font-style: italic;">         Default values implicitly set as 5000ms and 10000ms respectively. The user DOES NOT need to specify </span>
<span style="color: #808080; font-style: italic;">         these unless the bandwidth is extremely low or if there is an extremely high latency--&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;httpConnTimeout&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>5000<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;httpReadTimeout&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>10000<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
        <span style="color: #808080; font-style: italic;">&lt;!-- If HTTP Basic authentication is enabled on the master, then the slave can be configured with the following --&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;httpBasicAuthUser&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>username<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;str</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;httpBasicAuthPassword&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>password<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/str<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
     <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/lst<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/requestHandler<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

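<p>Notes: the parameters above need little explanation. pollInterval sets how often the slave polls the master for new data. If near-real-time freshness is not required, 5 to 10 minutes is usually enough, which also keeps the slave's IndexSearcher from being swapped too often; the master's commit frequency can be kept roughly in line with it.</p>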
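<p>For instance, a slave that polls every five minutes needs only the two core parameters. A minimal sketch, in which the masterUrl is the same placeholder as in the sample above and the interval is simply the lower end of the 5 to 10 minute range:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">&lt;requestHandler name=&quot;/replication&quot; class=&quot;solr.ReplicationHandler&quot;&gt;
    &lt;lst name=&quot;slave&quot;&gt;
        &lt;!-- placeholder URL; replace host, port and core name --&gt;
        &lt;str name=&quot;masterUrl&quot;&gt;http://master_host:port/solr/corename/replication&lt;/str&gt;
        &lt;!-- poll the master every 5 minutes (HH:mm:ss) --&gt;
        &lt;str name=&quot;pollInterval&quot;&gt;00:05:00&lt;/str&gt;
    &lt;/lst&gt;
&lt;/requestHandler&gt;</pre></div></div>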
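<h2>2. HTTP API</h2>
<p>Solr's ReplicationHandler provides a set of HTTP commands (passed in the command parameter). The supported values are:<br />
	1) indexversion: the slave gets the latest index point information from the master.<br />
	2) filecontent: the slave downloads the content of a given file from the master.<br />
	3) filelist: the slave gets from the master the list of index files (and configuration files to replicate) for a given indexversion.<br />
	4) backup: backs up the index. If you worry that the index might get corrupted, back it up regularly.<br />
	5) fetchindex: triggers replication manually; equivalent to the slave's automatic replication.<br />
	6) disablepoll: stops the slave's polling.<br />
	7) enablepoll: resumes the slave's polling.<br />
	8) abortfetch: aborts a file download in progress on the slave.<br />
	9) commits: shows the IndexCommit points currently retained.<br />
	10) details: shows the slave's current replication details.<br />
	11) enablereplication: enables replication from the master to all slaves.<br />
	12) disablereplication: disables replication from the master to all slaves.</p>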
<h2>3. Performance</h2>
<p>Solr's ReplicationHandler downloads index files over HTTP in consecutive chunks instead of using the classic rsync. The Solr wiki gives the following performance comparison chart:<br />
<a href="http://www.kafka0102.com/wp-content/uploads/2010/07/transfer_time.png"><img src="http://www.kafka0102.com/wp-content/uploads/2010/07/transfer_time.png" alt="" title="transfer_time" width="800" height="600" class="aligncenter size-full wp-image-245" /></a><br />
<br/><br />
As the chart shows, the performance difference is small, so there is little to worry about.</p>
<h2>4. References</h2>
<p>http://wiki.apache.org/solr/SolrReplication</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/07/244.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>An analysis of the zoie DocIDMapperImpl class implementation</title>
		<link>http://www.kafka0102.com/2010/07/238.html</link>
		<comments>http://www.kafka0102.com/2010/07/238.html#comments</comments>
		<pubDate>Fri, 16 Jul 2010 20:55:36 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[bloom filter]]></category>
		<category><![CDATA[zoie]]></category>
		<category><![CDATA[源码分析]]></category>
		<category><![CDATA[算法]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=238</guid>
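		<description><![CDATA[A reader left a comment asking whether I knew how zoie's DocIDMapperImpl is implemented. To be honest, I had only skimmed zoie before: I knew what DocIDMapperImpl is for but had never looked closely at its algorithm. So, in the quiet of the night, I worked through the class carefully. I must admit that this kind of brain-taxing algorithm is a bit much for me. Below is my rough analysis; corrections are welcome if anything is off.

First, what DocIDMapperImpl is for. In zoie, uids and Lucene docids are in one-to-one correspondence. Mapping a docid to a uid is simple: allocate an array of size maxdoc, indexed by docid, whose values are the uids. This works because docids are assigned in increasing order and are always bounded. A uid, however, is a long, so an array cannot serve for the reverse mapping; the straightforward choice would be a HashMap. To save space, zoie uses a more efficient algorithm, namely the class below.]]></description>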
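			<content:encoded><![CDATA[<p>A reader left a comment asking whether I knew how zoie's DocIDMapperImpl is implemented. To be honest, I had only skimmed zoie before: I knew what DocIDMapperImpl is for but had never looked closely at its algorithm. So, in the quiet of the night, I worked through the class carefully. I must admit that this kind of brain-taxing algorithm is a bit much for me. Below is my rough analysis; corrections are welcome if anything is off.</p>
<p>First, what DocIDMapperImpl is for. In zoie, uids and Lucene docids are in one-to-one correspondence. Mapping a docid to a uid is simple: allocate an array of size maxdoc, indexed by docid, whose values are the uids. This works because docids are assigned in increasing order and are always bounded. A uid, however, is a long, so an array cannot serve for the reverse mapping; the straightforward choice would be a HashMap. To save space, zoie uses a more efficient algorithm, namely the class below, which looks somewhat like a variant application of the Bloom filter algorithm.</p>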

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">&nbsp;
<span style="color: #000000; font-weight: bold;">package</span> <span style="color: #006699;">proj.zoie.api.impl</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.util.Arrays</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.util.HashMap</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">proj.zoie.api.DocIDMapper</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">proj.zoie.api.ZoieIndexReader</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #008000; font-style: italic; font-weight: bold;">/**
 * @author ymatsuda
 * 
 */</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> DocIDMapperImpl <span style="color: #000000; font-weight: bold;">implements</span> DocIDMapper <span style="color: #009900;">&#123;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> _docArray<span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> _uidArray<span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> _start<span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> _filter<span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> _mask<span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> MIXER <span style="color: #339933;">=</span> <span style="color: #cc66cc;">2147482951</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;">// a prime number</span>
&nbsp;
	<span style="color: #008000; font-style: italic; font-weight: bold;">/**
	 * 
	 * @param uidArray has length maxdoc of the index, so each array index
	 * is a docid and the value is its uid; if a docid has been deleted,
	 * the value at its index is ZoieIndexReader.DELETED_UID
	 */</span>
	<span style="color: #000000; font-weight: bold;">public</span> DocIDMapperImpl<span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">long</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> uidArray<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000066; font-weight: bold;">int</span> len <span style="color: #339933;">=</span> uidArray.<span style="color: #006633;">length</span><span style="color: #339933;">;</span>
		<span style="color: #000066; font-weight: bold;">int</span> mask <span style="color: #339933;">=</span> len <span style="color: #339933;">/</span> <span style="color: #cc66cc;">4</span><span style="color: #339933;">;</span>
		mask <span style="color: #339933;">|=</span> <span style="color: #009900;">&#40;</span>mask <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		mask <span style="color: #339933;">|=</span> <span style="color: #009900;">&#40;</span>mask <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		mask <span style="color: #339933;">|=</span> <span style="color: #009900;">&#40;</span>mask <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		mask <span style="color: #339933;">|=</span> <span style="color: #009900;">&#40;</span>mask <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">8</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		mask <span style="color: #339933;">|=</span> <span style="color: #009900;">&#40;</span>mask <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">16</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		_mask <span style="color: #339933;">=</span> mask<span style="color: #339933;">;</span>
	<span style="color: #666666; font-style: italic;">//The code above first sets mask to len/4, then applies a series of shift-and-OR steps</span>
	<span style="color: #666666; font-style: italic;">//so that every bit below mask's highest set bit becomes 1. For example, if mask starts as</span>
	<span style="color: #666666; font-style: italic;">//binary 10110000, it ends up as binary 11111111. Only then can mask be ANDed with h below</span>
	<span style="color: #666666; font-style: italic;">//to select an index in the _filter array. It also follows that mask lies between len/4 and len/2.</span>
		_filter <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">long</span><span style="color: #009900;">&#91;</span>mask <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">long</span> uid <span style="color: #339933;">:</span> uidArray<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">!=</span> ZoieIndexReader.<span style="color: #006633;">DELETED_UID</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000066; font-weight: bold;">int</span> h <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">32</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">^</span> uid<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> MIXER<span style="color: #339933;">;</span>
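	<span style="color: #666666; font-style: italic;">//This hash spreads uids over the whole int range and reduces collisions between h values,</span>
	<span style="color: #666666; font-style: italic;">//so the resulting h is typically large in magnitude (and may be negative).</span>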
				<span style="color: #000066; font-weight: bold;">long</span> bits <span style="color: #339933;">=</span> _filter<span style="color: #009900;">&#91;</span>h <span style="color: #339933;">&amp;</span> _mask<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
				bits <span style="color: #339933;">|=</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>1L <span style="color: #339933;">&lt;&lt;</span> <span style="color: #009900;">&#40;</span>h <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">26</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				bits <span style="color: #339933;">|=</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>1L <span style="color: #339933;">&lt;&lt;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>h <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">20</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span> 0x3F<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				_filter<span style="color: #009900;">&#91;</span>h <span style="color: #339933;">&amp;</span> _mask<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> bits<span style="color: #339933;">;</span>
	<span style="color: #666666; font-style: italic;">//(h &gt;&gt;&gt; 26) takes the top 6 bits of the 32-bit h, so the shift distance of 1L &lt;&lt; is in 0-63;</span>
	<span style="color: #666666; font-style: italic;">//((h &gt;&gt; 20) &amp; 0x3F) takes the next 6 bits, so that shift distance is also in 0-63.</span>
	<span style="color: #666666; font-style: italic;">//Each of the two OR operations therefore sets one of the 64 bits of bits.</span>
	<span style="color: #666666; font-style: italic;">//The two ORs correspond to the two hash probes of a Bloom filter.</span>
	<span style="color: #666666; font-style: italic;">//As with any Bloom filter, a membership test can give false positives, and this one is no exception.</span>
	<span style="color: #666666; font-style: italic;">//Since bits has 64 bits, each uid sets at most two of them, and there are between len/4 and len/2 buckets,</span>
	<span style="color: #666666; font-style: italic;">//the collision rate per _filter bucket (index position) stays low when the hash is evenly distributed.</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
&nbsp;
		_start <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#91;</span>_mask <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span> <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		len <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
		<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">long</span> uid <span style="color: #339933;">:</span> uidArray<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">!=</span> ZoieIndexReader.<span style="color: #006633;">DELETED_UID</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				_start<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">32</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">^</span> uid<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> MIXER<span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span> _mask<span style="color: #009900;">&#93;</span><span style="color: #339933;">++;</span>
				len<span style="color: #339933;">++;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000066; font-weight: bold;">int</span> val <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span>
		<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> _start.<span style="color: #006633;">length</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			val <span style="color: #339933;">+=</span> _start<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			_start<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> val<span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		_start<span style="color: #009900;">&#91;</span>_mask<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> len<span style="color: #339933;">;</span>
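<span style="color: #666666; font-style: italic;">//_start is processed by two loops. The first counts how many uids fall into each bucket</span>
<span style="color: #666666; font-style: italic;">//and computes len, the number of valid uids. The second prepares for the steps below by turning</span>
<span style="color: #666666; font-style: italic;">//each entry into a running total: the number of valid uids in buckets 0 through the current one.</span>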
&nbsp;
		<span style="color: #000066; font-weight: bold;">long</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> partitionedUidArray <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">long</span><span style="color: #009900;">&#91;</span>len<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> docArray <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#91;</span>len<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">long</span> uid <span style="color: #339933;">:</span> uidArray<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">!=</span> ZoieIndexReader.<span style="color: #006633;">DELETED_UID</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> <span style="color: #339933;">--</span><span style="color: #009900;">&#40;</span>_start<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">32</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">^</span> uid<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> MIXER<span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span> _mask<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				partitionedUidArray<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> uid<span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000066; font-weight: bold;">int</span> s <span style="color: #339933;">=</span> _start<span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> i <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> _start.<span style="color: #006633;">length</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000066; font-weight: bold;">int</span> e <span style="color: #339933;">=</span> _start<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>s <span style="color: #339933;">&lt;</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #003399;">Arrays</span>.<span style="color: #006633;">sort</span><span style="color: #009900;">&#40;</span>partitionedUidArray, s, e<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			s <span style="color: #339933;">=</span> e<span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #666666; font-style: italic;">//These two loops fill partitionedUidArray and adjust the counts held in _start:</span>
	<span style="color: #666666; font-style: italic;">//after the decrements, _start[p] is the index in partitionedUidArray where bucket p begins.</span>
	<span style="color: #666666; font-style: italic;">//Note the -- applied to _start and the s &lt; e check; they handle buckets that contain several uids,</span>
	<span style="color: #666666; font-style: italic;">//keeping the uids of each bucket sorted and leaving a gap between the offsets of adjacent buckets.</span>
	<span style="color: #666666; font-style: italic;">//As a result, _uidArray and _docArray can be searched with binary search.</span>
		<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> docid <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> docid <span style="color: #339933;">&lt;</span> uidArray.<span style="color: #006633;">length</span><span style="color: #339933;">;</span> docid<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000066; font-weight: bold;">long</span> uid <span style="color: #339933;">=</span> uidArray<span style="color: #009900;">&#91;</span>docid<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">!=</span> ZoieIndexReader.<span style="color: #006633;">DELETED_UID</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> p <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">32</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">^</span> uid<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> MIXER<span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span> _mask<span style="color: #339933;">;</span>
				<span style="color: #000066; font-weight: bold;">int</span> idx <span style="color: #339933;">=</span> findIndex<span style="color: #009900;">&#40;</span>partitionedUidArray, uid, _start<span style="color: #009900;">&#91;</span>p<span style="color: #009900;">&#93;</span>,
						_start<span style="color: #009900;">&#91;</span>p <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>idx <span style="color: #339933;">&gt;=</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
					docArray<span style="color: #009900;">&#91;</span>idx<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> docid<span style="color: #339933;">;</span>
				<span style="color: #009900;">&#125;</span>
			<span style="color: #009900;">&#125;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #666666; font-style: italic;">//Fill docArray</span>
		_uidArray <span style="color: #339933;">=</span> partitionedUidArray<span style="color: #339933;">;</span>
		_docArray <span style="color: #339933;">=</span> docArray<span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #008000; font-style: italic; font-weight: bold;">/**
	 * @see Once the constructor is understood, this method is easy to follow, so it is not discussed in detail.
	 */</span>
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">int</span> getDocID<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span> uid<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> h <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>uid <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">32</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">^</span> uid<span style="color: #009900;">&#41;</span> <span style="color: #339933;">*</span> MIXER<span style="color: #339933;">;</span>
		<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> p <span style="color: #339933;">=</span> h <span style="color: #339933;">&amp;</span> _mask<span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// check the filter</span>
		<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span> bits <span style="color: #339933;">=</span> _filter<span style="color: #009900;">&#91;</span>p<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>bits <span style="color: #339933;">&amp;</span> <span style="color: #009900;">&#40;</span>1L <span style="color: #339933;">&lt;&lt;</span> <span style="color: #009900;">&#40;</span>h <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">26</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span>
				<span style="color: #339933;">||</span> <span style="color: #009900;">&#40;</span>bits <span style="color: #339933;">&amp;</span> <span style="color: #009900;">&#40;</span>1L <span style="color: #339933;">&lt;&lt;</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>h <span style="color: #339933;">&gt;&gt;</span> <span style="color: #cc66cc;">20</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&amp;</span> 0x3F<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span>
			<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
		<span style="color: #666666; font-style: italic;">// do binary search in the partition</span>
		<span style="color: #000066; font-weight: bold;">int</span> begin <span style="color: #339933;">=</span> _start<span style="color: #009900;">&#91;</span>p<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #000066; font-weight: bold;">int</span> end <span style="color: #339933;">=</span> _start<span style="color: #009900;">&#91;</span>p <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		<span style="color: #666666; font-style: italic;">// we have some uids in this partition, so we assume (begin &lt;= end)</span>
		<span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000066; font-weight: bold;">int</span> mid <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>begin <span style="color: #339933;">+</span> end<span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
			<span style="color: #000066; font-weight: bold;">long</span> midval <span style="color: #339933;">=</span> _uidArray<span style="color: #009900;">&#91;</span>mid<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>midval <span style="color: #339933;">==</span> uid<span style="color: #009900;">&#41;</span>
				<span style="color: #000000; font-weight: bold;">return</span> _docArray<span style="color: #009900;">&#91;</span>mid<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>mid <span style="color: #339933;">==</span> end<span style="color: #009900;">&#41;</span>
				<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>midval <span style="color: #339933;">&lt;</span> uid<span style="color: #009900;">&#41;</span>
				begin <span style="color: #339933;">=</span> mid <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">else</span>
				end <span style="color: #339933;">=</span> mid<span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
&nbsp;
	<span style="color: #008000; font-style: italic; font-weight: bold;">/**
	 * @see Binary-searches a range of arr for the index at which uid is stored.
	 * @param arr
	 * @param uid
	 * @param begin
	 * @param end
	 * @return
	 */</span>
	<span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">int</span> findIndex<span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> arr, <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #000066; font-weight: bold;">long</span> uid,
			<span style="color: #000066; font-weight: bold;">int</span> begin, <span style="color: #000066; font-weight: bold;">int</span> end<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>begin <span style="color: #339933;">&gt;=</span> end<span style="color: #009900;">&#41;</span>
			<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
		end<span style="color: #339933;">--;</span>
&nbsp;
		<span style="color: #000000; font-weight: bold;">while</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000066; font-weight: bold;">int</span> mid <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>begin <span style="color: #339933;">+</span> end<span style="color: #009900;">&#41;</span> <span style="color: #339933;">&gt;&gt;&gt;</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
			<span style="color: #000066; font-weight: bold;">long</span> midval <span style="color: #339933;">=</span> arr<span style="color: #009900;">&#91;</span>mid<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>midval <span style="color: #339933;">==</span> uid<span style="color: #009900;">&#41;</span>
				<span style="color: #000000; font-weight: bold;">return</span> mid<span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>mid <span style="color: #339933;">==</span> end<span style="color: #009900;">&#41;</span>
				<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #339933;">-</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>midval <span style="color: #339933;">&lt;</span> uid<span style="color: #009900;">&#41;</span>
				begin <span style="color: #339933;">=</span> mid <span style="color: #339933;">+</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">else</span>
				end <span style="color: #339933;">=</span> mid<span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>
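<p>To make the bit manipulation in the constructor concrete, here is a small standalone sketch of the mask computation and of the two filter bit positions. The class and method names (<code>MaskAndFilterDemo</code>, <code>smear</code>, <code>hash</code>, <code>filterBits</code>) are made up for this illustration; only the arithmetic is taken from the listing above.</p>

```java
// Illustrative only: reproduces the arithmetic of the listing above
// under hypothetical names; it is not part of zoie.
class MaskAndFilterDemo {
    static final int MIXER = 2147482951; // same constant as in the listing

    // Sets every bit below the highest set bit, giving a value of the form 2^k - 1.
    static int smear(int mask) {
        mask |= (mask >> 1);
        mask |= (mask >> 2);
        mask |= (mask >> 4);
        mask |= (mask >> 8);
        mask |= (mask >> 16);
        return mask;
    }

    // Same hash as the listing: fold the long into an int, then multiply.
    static int hash(long uid) {
        return (int) ((uid >>> 32) ^ uid) * MIXER;
    }

    // The (at most) two bits a uid contributes to its bucket's 64-bit filter word.
    static long filterBits(int h) {
        return (1L << (h >>> 26)) | (1L << ((h >> 20) & 0x3F));
    }

    public static void main(String[] args) {
        int len = 1000;
        int mask = smear(len / 4);   // 250 (binary 11111010) becomes 255
        System.out.println("mask=" + mask + " for len=" + len);
        int h = hash(123456789L);
        System.out.println("bucket=" + (h & mask)
                + " bitsSet=" + Long.bitCount(filterBits(h)));
    }
}
```

<p>Because mask always has the form 2^k - 1, <code>h &amp; mask</code> is a valid index into an array of length mask + 1 even when h is negative. When both 6-bit slices of h happen to be equal, the uid sets only one bit instead of two.</p>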

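<p>In terms of time complexity, DocIDMapperImpl is comparable to a HashMap (with an evenly distributed hash, the binary search usually takes only a few steps). In terms of space, _uidArray and _docArray in DocIDMapperImpl correspond to the keys and values of the HashMap entries. Beyond that, DocIDMapperImpl holds only _start and _filter, whereas each HashMap entry also stores the hash value and a next reference for collision chaining, and the table needs spare capacity to stay below its load factor. Even in the worst case, where mask is about half of len, DocIDMapperImpl still comes out ahead of a HashMap.</p>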
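<p>As a rough check of the space argument, the sketch below adds up the payload sizes of the four arrays held by DocIDMapperImpl for a given index size. It is a back-of-the-envelope estimate under stated assumptions: array object headers are ignored, and the class name <code>FootprintEstimate</code> and its methods are hypothetical. No figure is computed for HashMap, because its per-entry overhead depends on the JVM.</p>

```java
// Hypothetical helper, not part of zoie: adds up the payload bytes of the
// four arrays held by DocIDMapperImpl (array object headers are ignored).
class FootprintEstimate {
    // Same bit-smearing as the constructor: fills all bits below the highest set bit.
    static int smear(int mask) {
        mask |= (mask >> 1);
        mask |= (mask >> 2);
        mask |= (mask >> 4);
        mask |= (mask >> 8);
        mask |= (mask >> 16);
        return mask;
    }

    // maxDoc: length of the input uidArray; liveUids: entries that are not deleted.
    static long arrayBytes(int maxDoc, int liveUids) {
        int mask = smear(maxDoc / 4);
        long uidBytes = 8L * liveUids;       // _uidArray: one long per live uid
        long docBytes = 4L * liveUids;       // _docArray: one int per live uid
        long filterBytes = 8L * (mask + 1);  // _filter: one long per bucket
        long startBytes = 4L * (mask + 2);   // _start: one int per bucket, plus one
        return uidBytes + docBytes + filterBytes + startBytes;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(arrayBytes(n, n) / (double) n + " bytes per uid");
    }
}
```

<p>Since mask + 1 is at most about len/2, _filter and _start together add at most roughly 6 bytes per document on top of the 12 bytes taken by the uid and docid themselves. A HashMap with boxed Long keys and Integer values additionally pays for an entry object and two boxed objects per mapping.</p>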
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/07/238.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sharing "Top 10 Performance Problems taken from Zappos, Monster, Thomson and Co"</title>
		<link>http://www.kafka0102.com/2010/07/234.html</link>
		<comments>http://www.kafka0102.com/2010/07/234.html#comments</comments>
		<pubDate>Mon, 12 Jul 2010 13:36:23 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[分享]]></category>
		<category><![CDATA[架构]]></category>
		<category><![CDATA[经验]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=234</guid>
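		<description><![CDATA[The article Top 10 Performance Problems taken from Zappos, Monster, Thomson and Co sums up some lessons about performance problems. None of it is especially novel, but it is solid and worth borrowing from, so I am sharing it here. This is not a full translation, and you are welcome to read the original directly; my own interpretation is mixed in, so please point out anything I have misunderstood.]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.dynatrace.com/2010/06/15/top-10-performance-problems-taken-from-zappos-monster-and-co/" target="_blank">Top 10 Performance Problems taken from Zappos, Monster, Thomson and Co</a> sums up some lessons about performance problems. None of it is especially novel, but it is solid and worth borrowing from, so I am sharing it here. This is not a full translation, and you are welcome to read the original directly; my own interpretation is mixed in, so please point out anything I have misunderstood.</p>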
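<h3>1. Too Many Database Calls</h3>
<p>This problem is about making too many database calls while handling a single request or transaction. Typical scenarios:<br />
1) Requesting unnecessary data, for example when programmers take the shortcut of "select *" and pull columns nobody needs.<br />
2) Requesting the same data several times. This usually happens within one transaction, when independent components each need the same data and, working in different contexts, nobody notices the duplicated query.<br />
3) Issuing many queries to fetch one particular piece of data, usually because complex SQL or stored procedures are not used effectively (that said, in some denormalized table designs, splitting complex logic into several SQL statements is natural).<br />
Further reading: <a href="http://blog.dynatrace.com/2009/04/30/linq2sql-prevent-performance-issues-when-operating-on-multiple-rows-with-stored-procedures" target="_blank">Blog on Linq2Sql Performance Issues on Database</a>,  <a href="http://blog.dynatrace.com/2009/10/16/video-on-common-performance-antipatterns-online/" target="_blank">Video on Performance Anti-Patterns</a></p>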
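<p>Scenario 2 (the same data requested several times within one request) can often be softened with a small request-scoped cache. The following is only a minimal sketch under assumptions: RequestScopedCache and its loader function are hypothetical names, and the loader stands in for whatever real database query the components would otherwise repeat.</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Remembers lookups for the lifetime of one request so identical queries run once. */
final class RequestScopedCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> loader; // hypothetical stand-in for the real database query
    private int loads;                   // how often the loader was actually invoked

    RequestScopedCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, calling the loader only on the first request for a key. */
    V get(K key) {
        V value = cache.get(key);
        if (value == null) {             // note: a null result is not cached and is reloaded
            value = loader.apply(key);
            loads++;
            cache.put(key, value);
        }
        return value;
    }

    int loads() {
        return loads;
    }

    public static void main(String[] args) {
        RequestScopedCache<Long, String> users = new RequestScopedCache<>(id -> "user-" + id);
        users.get(42L);
        users.get(42L); // served from the cache, no second "query"
        users.get(7L);
        System.out.println(users.loads()); // prints 2
    }
}
```

<p>Such a cache is meant to be created per request and thrown away afterwards, so it is deliberately not thread-safe and never needs invalidation.</p>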
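<h3>2. Synchronized to Death</h3>
<p>Using synchronization to protect shared data in an application is perfectly reasonable. Programmers, however, often apply it without much thought and make the synchronized section larger than it needs to be. The resulting performance problems rarely show up in a lightly loaded test environment, but under heavy production load they surface as performance and scalability problems.<br />
Further reading: <a href="http://blog.dynatrace.com/2009/04/02/performance-analysis-how-to-identify-synchronization-issues-under-load/" target="_blank">How to identify synchronization problems under load</a></p>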
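<p>To make "a synchronized section larger than it needs to be" concrete, here is a small sketch. EventLog and its methods are hypothetical names invented for illustration: addCoarse holds the lock while formatting, addFine does the formatting first and locks only around the shared list.</p>

```java
import java.util.ArrayList;
import java.util.List;

final class EventLog {
    private final List<String> entries = new ArrayList<>();

    /** Too broad: the formatting work runs while the lock is held. */
    synchronized void addCoarse(String user, String action) {
        entries.add(format(user, action));
    }

    /** Narrower: only the mutation of the shared list is guarded. */
    void addFine(String user, String action) {
        String line = format(user, action); // touches no shared state, so no lock is needed
        synchronized (this) {
            entries.add(line);
        }
    }

    synchronized int size() {
        return entries.size();
    }

    private static String format(String user, String action) {
        return user + ":" + action;
    }

    /** Hammers addFine from several threads and returns the final entry count. */
    static int demo(int threads, int perThread) {
        EventLog log = new EventLog();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.addFine("u", "a");
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000)); // prints 4000
    }
}
```

<p>Both variants are correct; the difference only shows under contention, when the work done inside the lock determines how long other threads wait.</p>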
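<h3>3. Too chatty on the remoting channels</h3>
<p>From a developer's point of view, a wrapped API makes a remote call as simple as a local one. It obviously brings more problems than a local call, though: latency, data serialization, network traffic, memory usage and so on. When an application has too many layers of remote calls, these costs have to be taken into account. This reminds me of the heyday of EJB2, when even very ordinary web applications were forced into pointless distribution, dragging performance down by several notches.<br />
Further reading: <a href="http://blog.dynatrace.com/2009/09/28/performance-considerations-in-distributed-applications/" target="_blank">Performance Considerations in Distributed Applications</a></p>
<h3>4. Wrong usage of O/R-Mappers</h3>
<p>ORM is great and lightens the programmer's load, but its complexity brings performance problems of its own. With frameworks such as Hibernate, tuned configuration and some advanced tricks can steer you clear of the performance traps, but that demands a high level of skill from the user. Without it, applications with real performance requirements should use ORM cautiously.<br />
Further reading: <a href="http://blog.dynatrace.com/2009/02/16/understanding-caching-in-hibernate-part-one-the-session-cache/" target="_blank">Understanding Hibernate Session Cache</a>, <a href="http://blog.dynatrace.com/2009/02/16/understanding-caching-in-hibernate-part-two-the-query-cache/" target="_blank">Understanding the Query Cache</a>, <a href="http://blog.dynatrace.com/2009/03/24/understanding-caching-in-hibernate-part-three-the-second-level-cache/" target="_blank">Understanding the Second Level Cache</a></p>
<h3>5. Memory Leaks</h3>
<p>Platforms such as Java and .NET have garbage collectors that manage memory and free programmers from the memory-leak nightmares of C/C++. Leaks are still possible, though: if the application throws OutOfMemoryError (Java) or OutOfMemoryException (.NET), take a close look at which no-longer-needed references are never released.<br />
Further reading: <a href="http://blog.dynatrace.com/2010/03/03/week-5-hunting-lost-treasures-understanding-and-finding-memory-leaks/" target="_blank">Understanding and finding Memory Leaks</a></p>
<h3>6. Problematic 3rd Party Code/Components</h3>
<p>Using third-party software in an application is normal, especially in the Java community with its enthusiasm for open source. When choosing it, be careful to pick software that is mature, stable and thoroughly evaluated; many problems attributed to third-party software are in fact caused by using it incorrectly.<br />
Further reading: <a href="http://blog.dynatrace.com/2010/03/18/how-to-avoid-the-top-5-sharepoint-performance-mistakes/" target="_blank">Top SharePoint Performance Mistakes</a></p>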
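<h3>7. Wasteful handling of scarce resources</h3>
<p>For resources such as memory, CPU, IO and file handles, incorrect or wasteful use leads to performance and scalability problems. Examples abound: memory pools, thread pools and connection pools all exist to use these resources efficiently.<br />
Further reading: <a href="http://blog.dynatrace.com/2009/02/25/resource-leak-detection-in-net-applications/" target="_blank">Resource Leak detection in .NET Applications</a></p>
<h3>8. Bloated web frontends</h3>
<p>When an application has performance problems, we tend to focus on back-end storage access and neglect front-end optimization. There is now plenty of ready-made experience on front-end optimization that yields improvements quickly and cheaply, such as Yahoo's optimization rules and good books like Even Faster Web Sites (《高性能网站建设进阶指南》); front-end developers should know and apply them.<br />
Further reading: <a href="http://blog.dynatrace.com/2010/04/21/how-better-caching-helps-frankfurts-airport-website-to-handle-additional-load-caused-by-the-volcano/" target="_blank">How Better Caching would help speed up Frankfurt  Airport Web Site</a></p>
<h3>9. Wrong Cache Strategy leads to excessive Garbage Collection</h3>
<p>To fix storage-access performance problems, the usual answer is to put a cache in front of the DB. An ill-suited caching strategy may not bring any visible benefit, though, so the strategy and its effect need to be evaluated and verified. If the cache lives inside a Java program, pay even more attention to GC.<br />
Further reading: <a href="http://blog.dynatrace.com/2009/08/13/java-memory-problems/" target="_blank">Java Memory Problems</a>, <a href="http://blog.dynatrace.com/2009/04/08/performance-analysis-identify-gc-bottlenecks-in-distributed-heterogeneous-environments/" target="_blank">Identify GC Bottlenecks in Distributed Applications</a></p>
<h3>10. Intermittent Problems</h3>
<p>Problems that come and go. Writing bug-free programs is hard; even a tiny binary search can fail on some boundary condition. Every now and then we find problems in production, some of which look downright bizarre at first. Ensuring quality means putting in the effort on unit tests, functional tests, coverage tests, performance tests and so on to catch problems early, although even that cannot guarantee production will be free of bizarre problems.<br />
Further reading: <a href="http://blog.dynatrace.com/2009/12/02/tracing-intermittent-errors-guest-blog-by-lucy-monahan-from-novell/" target="_blank">Tracing Intermittent Errors by Lucy Monahan from Novell</a>,  <a href="http://blog.dynatrace.com/2009/01/07/how-to-find-invisible-performance-problems/" target="_blank">How to find invisible performance problems</a></p>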
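<p>The binary-search remark can be illustrated with the well-known midpoint overflow: computing the middle as (lo + hi) / 2 goes negative once lo + hi exceeds Integer.MAX_VALUE, which only happens on very large arrays and is therefore easy to miss in tests. The sketch below (SafeBinarySearch is a hypothetical class name) uses the unsigned shift instead.</p>

```java
final class SafeBinarySearch {
    /** Returns the index of key in the sorted array, or -1 if it is absent. */
    static int indexOf(long[] sorted, long key) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift stays correct even if lo + hi overflows
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else if (sorted[mid] > key) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    /** The naive midpoint: negative once lo + hi exceeds Integer.MAX_VALUE. */
    static int naiveMid(int lo, int hi) {
        return (lo + hi) / 2;
    }

    /** The overflow-tolerant midpoint used by indexOf above. */
    static int safeMid(int lo, int hi) {
        return (lo + hi) >>> 1;
    }

    public static void main(String[] args) {
        System.out.println(naiveMid(1_500_000_000, 2_000_000_000)); // a negative number
        System.out.println(safeMid(1_500_000_000, 2_000_000_000));  // prints 1750000000
    }
}
```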
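<h3>11. (Bonus Problem) Expensive Serialization</h3>
<p>The title promised 10 problems, but the author threw in an eleventh as a buy-10-get-1-free bonus. Remote calls were already described as expensive; the focus here is the cost of serializing and deserializing the transferred data. Complex or inefficient transport protocols and data formats (SOAP, XML and the like) can cause serious performance problems.<br />
Further reading: <a href="http://blog.dynatrace.com/2009/09/28/performance-considerations-in-distributed-applications/" target="_blank">Performance Considerations in Distributed Applications</a></p>
<p>Finally, if you are interested, I recommend following the "Further reading" links under each item; some of them are quite good.</p>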
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/07/234.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Avoiding the Chinese tokenization flaw in lucene queryparser</title>
		<link>http://www.kafka0102.com/2010/07/226.html</link>
		<comments>http://www.kafka0102.com/2010/07/226.html#comments</comments>
		<pubDate>Sat, 10 Jul 2010 06:01:19 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[lucene]]></category>
		<category><![CDATA[queryparser]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=226</guid>
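		<description><![CDATA[Many people who use lucene rely on its queryparser to parse queries. However, from the beginning until now, lucene's queryparser has never fully accounted for the characteristics of languages such as Chinese, so Chinese queries can bafflingly return no results. This bug is LUCENE-2458.]]></description>
			<content:encoded><![CDATA[<p>Many people who use lucene rely on its queryparser to parse queries. However, from the beginning until now, lucene's queryparser has never fully accounted for the characteristics of languages such as Chinese, so Chinese queries can bafflingly return no results. This bug is <a href="https://issues.apache.org/jira/browse/LUCENE-2458" target="_blank">LUCENE-2458</a>.</p>
<p>Put simply, for a contiguous Chinese query, queryparser turns the Term sequence returned by the Analyzer into a PhraseQuery (or possibly a MultiPhraseQuery), and PhraseQuery's default matching rule requires the Terms to appear in the indexed document in exactly that order with nothing in between. For English queries this is acceptable. When queryparser parses a query, it first splits it on AND, OR and NOT (think of this as queryparser's first level of analysis; the top-level Query built from the split is a BooleanQuery) and then hands each resulting subquery to the Analyzer (provided it qualifies as a FieldQuery; otherwise it may become a RangeQuery, WildcardQuery and so on). English words are separated by spaces, which amounts to an OR query, so an English subquery can reasonably be treated as a phrase (for example several words joined by hyphens, or a mix of letters and digits; in lucene query syntax, text inside explicit double quotes is a phrase). For Chinese, however, analyzing a subquery into a PhraseQuery is a real problem. Take the subquery "诺基亚N97": as a PhraseQuery it requires "诺基亚N97" to appear in the indexed document, and if any other word sits between "诺基亚" and "N97" the document does not match. This particular example can be worked around by adjusting PhraseQuery's slop parameter, but an AND BooleanQuery is the better fit here, and BooleanQuery also scores documents much better than PhraseQuery. There are also tokenization results where the TermQuerys should be ORed together, and PhraseQuery is clearly unsuitable for those too.</p>
<p>As noted in LUCENE-2458, the bug will be fixed in 3.1 and 4: only subqueries explicitly enclosed in double quotes produce a PhraseQuery, and otherwise a subclass can be derived to customize the handling. For now, if you use IK as your Analyzer, its IKQueryParser is a good alternative; it builds a BooleanQuery combined from ANDs and ORs. But because BooleanQuery ignores the positions of the Terms in the document and scores purely on term frequency, retrieval quality is sometimes not great either. How does everyone else handle this? I have thought about extending its Query and Scorer, but that looks like a fair amount of work and I have not had the energy to invest in it yet.</p>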
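<p>The difference between the two matching rules can be shown without lucene at all. The sketch below is plain Java over token lists and does not use or imitate the lucene API; PhraseVsAnd and its methods are hypothetical names, and the ASCII tokens "nokia" and "n97" stand in for 诺基亚 and N97.</p>

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PhraseVsAnd {
    /** Phrase semantics: the query terms must appear consecutively and in order. */
    static boolean phraseMatch(List<String> doc, List<String> terms) {
        return Collections.indexOfSubList(doc, terms) >= 0;
    }

    /** AND semantics: every term must occur somewhere; positions are ignored. */
    static boolean andMatch(List<String> doc, List<String> terms) {
        return doc.containsAll(terms);
    }

    public static void main(String[] args) {
        // Tokens of an indexed document in which another word sits between the two query terms.
        List<String> doc = Arrays.asList("nokia", "phone", "n97", "review");
        List<String> query = Arrays.asList("nokia", "n97");
        System.out.println(phraseMatch(doc, query)); // prints false
        System.out.println(andMatch(doc, query));    // prints true
    }
}
```

<p>A slop value relaxes the first rule by tolerating a bounded number of intervening positions, while the second rule gives up positional information entirely, which is the scoring weakness mentioned above.</p>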
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/07/226.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Java programming with mongodb</title>
		<link>http://www.kafka0102.com/2010/07/209.html</link>
		<comments>http://www.kafka0102.com/2010/07/209.html#comments</comments>
		<pubDate>Sat, 03 Jul 2010 17:16:59 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[nosql]]></category>
		<category><![CDATA[mongodb]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=209</guid>
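		<description><![CDATA[	This week I started using mongodb experimentally. The use case is simple, so my understanding of mongodb is not deep yet. This post mainly covers Java client programming for mongodb; the material is simple and this is just a summary. I have to say that a store like mongodb, sitting between kv and sql, suits many internet applications well. mongodb already has plenty of production use cases and a very active community (quite a few people in China have studied it in depth, and if I have the time and energy I may dig into it too), so it is well worth watching.]]></description>
			<content:encoded><![CDATA[<p>	This week I started using mongodb experimentally. The use case is simple, so my understanding of mongodb is not deep yet. This post mainly covers Java client programming for mongodb; the material is simple and this is just a summary. I have to say that a store like mongodb, sitting between kv and sql, suits many internet applications well. mongodb already has plenty of production use cases and a very active community (quite a few people in China have studied it in depth, and if I have the time and energy I may dig into it too), so it is well worth watching.</p>
<p>	Back to the point: here are some notes on developing mongodb applications in Java. The most direct way to talk to mongodb from Java is the MongoDB Java Driver, which can be downloaded from http://github.com/mongodb/mongo-java-driver/downloads. Overall, the API for working with mongodb from Java is quite concise; some common usage is introduced below.</p>
<h2>1. Connecting to the database</h2>
<p>	Sample code for connecting to mongodb:</p>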

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	Mongo m <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Mongo<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;localhost&quot;</span>,<span style="color: #cc66cc;">27017</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	DB db <span style="color: #339933;">=</span> m.<span style="color: #006633;">getDB</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;db_test&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

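<p>	Although we now hold db, an object representing a connection to mongodb's db_test database, no real connection to mongodb has been made at this point, so no exception is thrown even if the database is not running, though you still need to catch the exception that instantiating Mongo declares. The mongodb java driver pools its connections, so an application only needs to instantiate one Mongo object, and operations on it are thread-safe, which is really convenient for development.</p>
<h2>2. Getting a DBCollection</h2>
<p>	A mongodb collection is represented in Java by DBCollection (an abstract class, not that you need to know). Creating a DBCollection instance is also one line of code and, like creating a DB instance, involves no actual communication with the database.</p>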

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	DBCollection coll <span style="color: #339933;">=</span> db.<span style="color: #006633;">getCollection</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;collection1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>	To get something like mysql's "show tables", use the following code:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	Set<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span> colls <span style="color: #339933;">=</span> db.<span style="color: #006633;">getCollectionNames</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span> s <span style="color: #339933;">:</span> colls<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	    <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>s<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span></pre></div></div>

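<h2>3. Inserting documents</h2>
<p>	mongodb stores documents in JSON format, and the handiest Java class for representing that format is Map. The BasicDBObject provided by the MongoDB Java Driver is a Map (it inherits from LinkedHashMap and implements the DBObject interface); the driver converts the data in the Map to BSON for transmission to mongodb. Here is an example of inserting a document:</p>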

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	DBCollection coll <span style="color: #339933;">=</span> db.<span style="color: #006633;">getCollection</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;collection1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	BasicDBObject doc <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> BasicDBObject<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	doc.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;name&quot;</span>, <span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	doc.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;age&quot;</span>, <span style="color: #cc66cc;">28</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	doc.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;time&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">Date</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	coll.<span style="color: #006633;">insert</span><span style="color: #009900;">&#40;</span>doc<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

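<p>	Every document inserted into mongodb gets a unique identifier, _id. When coll.insert(doc); is called, the driver checks whether the document has an _id field and, if not, generates an ObjectId instance as its value. The ObjectId is encoded from four parts: the current time, a machine identifier, the process id and an auto-incrementing integer.<br />
	The insert function also supports inserting a list of documents:</p>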

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">insert<span style="color: #009900;">&#40;</span>List<span style="color: #339933;">&lt;</span>DBObject<span style="color: #339933;">&gt;</span> list<span style="color: #009900;">&#41;</span></pre></div></div>

<p>The other write operations include update( DBObject q , DBObject o ) and remove( DBObject o ).</p>
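<p>To give a feel for how the four ObjectId parts mentioned above fit together, here is a sketch that packs them into 12 bytes. It assumes the commonly documented layout of 4 bytes of time, 3 bytes of machine id, 2 bytes of process id and 3 bytes of counter; ObjectIdSketch is a hypothetical class, not the driver's ObjectId, whose internal representation may differ.</p>

```java
import java.nio.ByteBuffer;

final class ObjectIdSketch {
    /** Packs the four parts into 12 bytes: 4 (time) + 3 (machine) + 2 (pid) + 3 (counter). */
    static byte[] pack(int epochSeconds, int machine, int pid, int counter) {
        ByteBuffer buf = ByteBuffer.allocate(12); // ByteBuffer is big-endian by default
        buf.putInt(epochSeconds);
        buf.put((byte) (machine >> 16)).put((byte) (machine >> 8)).put((byte) machine);
        buf.putShort((short) pid);
        buf.put((byte) (counter >> 16)).put((byte) (counter >> 8)).put((byte) counter);
        return buf.array();
    }

    /** Renders the bytes as the familiar 24-character lowercase hex string. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int now = (int) (System.currentTimeMillis() / 1000L);
        // Machine id, pid and counter are made-up illustration values.
        System.out.println(toHex(pack(now, 0x0a0b0c, 0x1234, 1)));
    }
}
```

<p>Because the time comes first, ids generated later compare greater, which is why sorting by _id roughly follows insertion order.</p>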
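<h2>4. Querying documents</h2>
<h3>4.1. findOne</h3>
<p>	findOne returns the first record that satisfies the condition (which does not mean only one record in the database satisfies it). The condition is expressed as a DBObject, for example:</p>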

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	DBCollection coll <span style="color: #339933;">=</span> db.<span style="color: #006633;">getCollection</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;collection1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	BasicDBObject cond <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> BasicDBObject<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	cond.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;name&quot;</span>, <span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	cond.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;age&quot;</span>, <span style="color: #cc66cc;">28</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	DBObject ret <span style="color: #339933;">=</span> coll.<span style="color: #006633;">findOne</span><span style="color: #009900;">&#40;</span>cond<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>ret<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>	The result is a DBObject whose values can be read with get(key). Query conditions can express complex forms through multiple levels of nesting, for example:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	BasicDBObject query <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> BasicDBObject<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	query.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;i&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> BasicDBObject<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;$gt&quot;</span>, <span style="color: #cc66cc;">50</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>  <span style="color: #666666; font-style: italic;">// e.g. find all where i &gt; 50</span></pre></div></div>

<h3>4.2. find</h3>
<p>	The find function queries for a set of documents; the DBCursor it returns is an iterator over DBObject. Example usage:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	DBCollection coll <span style="color: #339933;">=</span> db.<span style="color: #006633;">getCollection</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;collection1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	BasicDBObject cond <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> BasicDBObject<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	cond.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;i&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> BasicDBObject<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;$gt&quot;</span>, <span style="color: #cc66cc;">20</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">append</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;$lte&quot;</span>, <span style="color: #cc66cc;">30</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	DBCursor ret <span style="color: #339933;">=</span> coll.<span style="color: #006633;">find</span><span style="color: #009900;">&#40;</span>cond<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">while</span><span style="color: #009900;">&#40;</span>ret.<span style="color: #006633;">hasNext</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	   <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>ret.<span style="color: #006633;">next</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span></pre></div></div>

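<h2>5. Using indexes</h2>
<p>	An index is created with a statement such as coll.createIndex(new BasicDBObject(&quot;i&quot;, 1)); where i is the field to index and 1 means ascending (-1 means descending). As you can see, DBObject serves as the java client's general-purpose structure. To list the indexes, use DBCollection.getIndexInfo().</p>
<h2>6. Concurrency in the MongoDB Java Driver</h2>
<p>	As mentioned earlier, the MongoDB Java Driver pools its connections. The pool keeps 10 connections by default, which can be changed through the driver's options (MongoOptions), and a single Mongo instance is all an application needs. Each pooled connection is represented by a DBPort (not a DBCollection) and lives in a DBPortPool, so operations on a DBCollection are not guaranteed to use the same connection. If the same connection must be used throughout one application request, use the following snippet:</p>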

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	DB db...<span style="color: #339933;">;</span>
	db.<span style="color: #006633;">requestStart</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #666666; font-style: italic;">//code....</span>
	db.<span style="color: #006633;">requestDone</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>	Connections used between requestStart and requestDone do not come from the DBPortPool but from a ThreadLocal&lt;MyPort&gt; variable of the current thread (MyPort holds a DBPort member).</p>
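<p>The pinning idea behind requestStart and requestDone can be modeled in a few lines. This is only a sketch of the concept under assumptions, not the driver's code: PinnedPool is a hypothetical class, connections are modeled as plain names, and the real DBPortPool and MyPort certainly differ in detail.</p>

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class PinnedPool {
    private final BlockingQueue<String> idle;                        // pooled "connections", modeled as names
    private final ThreadLocal<String> pinned = new ThreadLocal<>();  // per-thread pinned connection

    PinnedPool(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add("conn-" + i);
        }
    }

    /** Pins one pooled connection to the calling thread. */
    void requestStart() {
        if (pinned.get() == null) {
            pinned.set(idle.remove()); // throws NoSuchElementException if the pool is exhausted
        }
    }

    /** Returns the pinned connection, if any, to the pool. */
    void requestDone() {
        String conn = pinned.get();
        if (conn != null) {
            pinned.remove();
            idle.add(conn);
        }
    }

    /** Runs one operation and reports which connection served it. */
    String execute() {
        String conn = pinned.get();
        if (conn != null) {
            return conn;               // inside a request: always the pinned connection
        }
        conn = idle.remove();          // outside a request: borrow from the pool
        idle.add(conn);                // hand it straight back; it rejoins at the tail
        return conn;
    }

    public static void main(String[] args) {
        PinnedPool pool = new PinnedPool(2);
        System.out.println(pool.execute() + " " + pool.execute()); // two different connections
        pool.requestStart();
        System.out.println(pool.execute() + " " + pool.execute()); // the same connection twice
        pool.requestDone();
    }
}
```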
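<h2>7. Other options</h2>
<p>	Good as the Java mongodb driver is, just as many people prefer an ORM framework to raw JDBC, there are other choices for a mongodb java client.<br />
	1) POJO and DAO support. For ORM enthusiasts, Morphia (http://code.google.com/p/morphia/wiki/QuickStart) is a good choice; it maps POJOs via annotations and supports CRUD operations through DAOs.<br />
	2) DSL support. Sculptor is one such tool: you write a neutral DSL file and Sculptor translates it into code. This is usually unappealing unless the application is multi-language and the DSL can be translated into several programming languages; otherwise it adds learning cost with no payoff.<br />
	3) JDBC support. mongo-jdbc is one such project, but it is still experimental. It may be trying to appeal to Java programmers, but it clearly cannot be fully JDBC-compatible, and many Java programmers are not fond of JDBC anyway, so it is not really worth using.</p>
<h2>8. References</h2>
<p>1. http://www.mongodb.org/display/DOCS/Java+Tutorial<br />
2. http://api.mongodb.org/java/2.0/index.html</p>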
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/07/209.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replacing new and factory classes with the dependency injection framework Google Guice</title>
		<link>http://www.kafka0102.com/2010/06/193.html</link>
		<comments>http://www.kafka0102.com/2010/06/193.html#comments</comments>
		<pubDate>Fri, 25 Jun 2010 20:07:22 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[framework]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[di]]></category>
		<category><![CDATA[guice]]></category>
		<category><![CDATA[ioc]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=193</guid>
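		<description><![CDATA[	In the new system I used Guice to unify how objects are injected and created. It was not a deliberate goal but the result of two classic problems:
	1) Objects can be created with new, static factories, abstract factories and so on, so creation is inconsistent; it is hard to say which approach is best, and as the code evolves a different approach may become the better one.
	2) The most direct way to set a class member is to create it inside the class, but such hard-coding hurts testability. The fix is to inject members from outside, and the benefits speak for themselves; but in a complex system, the associations and layering between classes mean that creating one upper-level object often depends on several lower-level classes, which makes the creation code verbose and complicated.
	To solve both problems I thought of a dependency injection (DI) framework (also called an IOC container, for inversion of control), and I needed to pick a handy one to meet my needs.]]></description>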
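			<content:encoded><![CDATA[<h2>1. Using dependency injection</h2>
<p>	In the new system I used Guice to unify how objects are injected and created. It was not a deliberate goal but the result of two classic problems:<br />
	1) Objects can be created with new, static factories, abstract factories and so on, so creation is inconsistent; it is hard to say which approach is best, and as the code evolves a different approach may become the better one.<br />
	2) The most direct way to set a class member is to create it inside the class, but such hard-coding hurts testability. The fix is to inject members from outside, and the benefits speak for themselves; but in a complex system, the associations and layering between classes mean that creating one upper-level object often depends on several lower-level classes, which makes the creation code verbose and complicated.<br />
	To solve both problems I thought of a dependency injection (DI) framework (also called an IOC container, for inversion of control), and I needed to pick a handy one to meet my needs.</p>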
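<h2>2. Choosing google guice</h2>
<p>	When it comes to DI frameworks, the biggest name is Spring. I dabbled in Spring long ago and it is simple to use, but in the end I chose google's Guice. If I must give reasons: first, Guice is also simple to use; second, Spring is somewhat large and natively needs a configuration file, which I did not want (since Spring 2 introduced annotations, configuration can be much simpler and it should be possible to drop the file and bind directly in code, but I never checked how troublesome that non-mainstream approach is); third, I wanted to try something I had not used before.<br />
	Guice was developed by google's "Crazy Bob" and has stayed reasonably active since it was open-sourced. Bob built a new wheel instead of choosing Spring mainly because of Spring's tedious configuration (separating code from configuration complicates testing and maintenance) and Spring's relatively low performance (for most applications Spring's performance is acceptable, although Guice really is much faster). But once a wheel is built and published, people naturally compare it with Spring. I do not want to spend much energy on that; both are good choices, so understand what each offers and pick whichever suits your application. For most application systems (especially the SSH stack), Spring remains a good choice. As for prospects, the popular, mature Spring with a professional company behind it will no doubt develop further than Guice, which stands alone. For a comparison among DI frameworks of the same kind, Guice could instead be compared with the veteran PicoContainer (born in the same period as Spring, never a big hit, yet still being updated).</p>
<h2>3. Using google guice</h2>
<h3>3.1. Basic usage</h3>
<p>    All right, don't mind my rambling; I can only assume you are new to Guice and genuinely interested in it, so I wrote the sample code by hand to show how Guice is used and what is distinctive about it. Code first; here is a very simple example:</p>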

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">interface</span> Foo <span style="color: #009900;">&#123;</span>
   <span style="color: #000066; font-weight: bold;">void</span> foo<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">interface</span> Bar <span style="color: #009900;">&#123;</span>
   <span style="color: #000066; font-weight: bold;">void</span> bar<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> FooImpl1 <span style="color: #000000; font-weight: bold;">implements</span> Foo <span style="color: #009900;">&#123;</span>
   <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> foo<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;FooImpl1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> BarImpl1 <span style="color: #000000; font-weight: bold;">implements</span> Bar <span style="color: #009900;">&#123;</span>
   <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> bar<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;BarImpl1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> BeanService1 <span style="color: #009900;">&#123;</span>
   <span style="color: #000000; font-weight: bold;">private</span> Foo foo<span style="color: #339933;">;</span>
   <span style="color: #000000; font-weight: bold;">private</span> Bar bar<span style="color: #339933;">;</span>
   @Inject
   <span style="color: #000000; font-weight: bold;">public</span> BeanService1<span style="color: #009900;">&#40;</span>Foo foo, Bar bar<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">foo</span> <span style="color: #339933;">=</span> foo<span style="color: #339933;">;</span>
       <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">bar</span> <span style="color: #339933;">=</span> bar<span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
   @Override
   <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">String</span> toString<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #0000ff;">&quot;BeanService1 [bar=&quot;</span> <span style="color: #339933;">+</span> bar <span style="color: #339933;">+</span> <span style="color: #0000ff;">&quot;, foo=&quot;</span> <span style="color: #339933;">+</span> foo <span style="color: #339933;">+</span> <span style="color: #0000ff;">&quot;]&quot;</span><span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> BeanService1Module <span style="color: #000000; font-weight: bold;">implements</span> Module <span style="color: #009900;">&#123;</span>
   <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> configure<span style="color: #009900;">&#40;</span>Binder binder<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       binder.<span style="color: #006633;">bind</span><span style="color: #009900;">&#40;</span>Foo.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">to</span><span style="color: #009900;">&#40;</span>FooImpl1.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
       binder.<span style="color: #006633;">bind</span><span style="color: #009900;">&#40;</span>Bar.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">to</span><span style="color: #009900;">&#40;</span>BarImpl1.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">       Injector injector <span style="color: #339933;">=</span> Guice.<span style="color: #006633;">createInjector</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> BeanService1Module<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
       BeanService1 bs <span style="color: #339933;">=</span> injector.<span style="color: #006633;">getInstance</span><span style="color: #009900;">&#40;</span>BeanService1.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
       <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>bs<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

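<p>    The example has no real logic: BeanService1 has two member variables that are injected through its constructor. Creating a BeanService1 object with Guice takes three steps:<br />
    1) Mark what needs injecting in BeanService1 with the @Inject annotation; here constructor injection is used.<br />
    2) Create BeanService1Module, which implements the Module interface. Module has a single method, void configure(Binder binder), in which bindings are declared, such as binding the Foo interface to FooImpl1. Then, when Guice creates a BeanService1 object at runtime and calls the constructor marked @Inject, it looks up the bound type for each parameter.<br />
    3) Create an Injector with Guice.createInjector(Module... modules). The Injector acts as the Factory; afterwards call its getInstance to create objects.<br />
    Simple, right? Using Guice is hardly more trouble than writing your own factory method. The other ways of injecting and binding are described next.</p>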
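<p>To show roughly what happens behind getInstance, here is a toy injector written with plain reflection. It is a sketch under assumptions and in no way Guice's implementation: MiniInjector, its bind method and the nested Inject annotation are hypothetical stand-ins declared inside the example so that it has no dependencies, and it supports only constructor injection with no scopes or error reporting.</p>

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

final class MiniInjector {
    /** Hypothetical stand-in for Guice's @Inject, declared locally for this sketch. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.CONSTRUCTOR)
    @interface Inject {}

    private final Map<Class<?>, Class<?>> bindings = new HashMap<>();

    /** Records that requests for type should be served by impl. */
    <T> MiniInjector bind(Class<T> type, Class<? extends T> impl) {
        bindings.put(type, impl);
        return this;
    }

    /** Creates an instance, recursively resolving the parameters of the @Inject constructor. */
    <T> T getInstance(Class<T> type) {
        Class<?> impl = bindings.getOrDefault(type, type);
        try {
            for (Constructor<?> c : impl.getDeclaredConstructors()) {
                if (c.isAnnotationPresent(Inject.class)) {
                    Class<?>[] paramTypes = c.getParameterTypes();
                    Object[] args = new Object[paramTypes.length];
                    for (int i = 0; i < args.length; i++) {
                        args[i] = getInstance(paramTypes[i]);
                    }
                    c.setAccessible(true);
                    return type.cast(c.newInstance(args));
                }
            }
            // No annotated constructor: fall back to the no-argument constructor.
            Constructor<?> noArg = impl.getDeclaredConstructor();
            noArg.setAccessible(true);
            return type.cast(noArg.newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create " + impl.getName(), e);
        }
    }

    // Tiny object graph mirroring the Foo/Bar/BeanService1 example above.
    interface Foo {}
    interface Bar {}
    static class FooImpl implements Foo {}
    static class BarImpl implements Bar {}
    static class Service {
        final Foo foo;
        final Bar bar;

        @Inject
        Service(Foo foo, Bar bar) {
            this.foo = foo;
            this.bar = bar;
        }
    }

    public static void main(String[] args) {
        MiniInjector injector = new MiniInjector()
                .bind(Foo.class, FooImpl.class)
                .bind(Bar.class, BarImpl.class);
        Service s = injector.getInstance(Service.class);
        System.out.println(s.foo.getClass().getSimpleName() + " " + s.bar.getClass().getSimpleName());
    }
}
```

<p>The caller never names FooImpl or BarImpl when asking for a Service, which is the point of the Module: the choice of implementation lives in one place.</p>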
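<h3>3.2. Ways to inject and bind</h3>
<p>    Besides the @Inject-on-constructor approach shown above, Guice supports two other common approaches:<br />
    2) @Inject on member fields; the example above could be changed to:</p>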

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> BeanService1 <span style="color: #009900;">&#123;</span>
   @Inject <span style="color: #000000; font-weight: bold;">private</span> Foo foo<span style="color: #339933;">;</span>
   @Inject <span style="color: #000000; font-weight: bold;">private</span> Bar bar<span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #000000; font-weight: bold;">public</span> BeanService1<span style="color: #009900;">&#40;</span>Foo foo, Bar bar<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">foo</span> <span style="color: #339933;">=</span> foo<span style="color: #339933;">;</span>
       <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">bar</span> <span style="color: #339933;">=</span> bar<span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
&nbsp;
   <span style="color: #000000; font-weight: bold;">public</span> BeanService1<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

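<p>    Of course, this version needs a no-argument constructor so that Guice can instantiate the class.<br />
    3) @Inject on methods. The example above becomes:</p>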

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> BeanService1 <span style="color: #009900;">&#123;</span>
   <span style="color: #000000; font-weight: bold;">private</span> Foo foo<span style="color: #339933;">;</span>
   <span style="color: #000000; font-weight: bold;">private</span> Bar bar<span style="color: #339933;">;</span>
&nbsp;
   <span style="color: #000000; font-weight: bold;">public</span> BeanService1<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
   <span style="color: #009900;">&#125;</span>
   @Inject
   <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> setFoo<span style="color: #009900;">&#40;</span>Foo foo<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">foo</span> <span style="color: #339933;">=</span> foo<span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
   @Inject
   <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> setBar<span style="color: #009900;">&#40;</span>Bar bar<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">bar</span> <span style="color: #339933;">=</span> bar<span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

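<p>    Once the injection points are annotated, Guice needs bindings to decide what to inject. The binding used in the example above, an interface bound to an implementation class, is the most common one. Guice also supports the following:<br />
    2) Binding a class to itself. BeanService1 is a concrete class that implements no interface, yet other classes may still have it injected. You can write binder.bind(BeanService1.class); to bind it to itself, although this is rarely necessary: when Guice sees a concrete class as an injected parameter, it creates the instance directly.<br />
    3) Binding an annotation to an instance. When the injected value is a simple type such as String or int, two things are needed. First, add a @Named annotation to the injected parameter:</p>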

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">  @Inject
   <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> setName<span style="color: #009900;">&#40;</span>@Named<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;beanService1Name&quot;</span><span style="color: #009900;">&#41;</span> <span style="color: #003399;">String</span> name<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
       <span style="color: #000000; font-weight: bold;">this</span>.<span style="color: #006633;">name</span> <span style="color: #339933;">=</span> name<span style="color: #339933;">;</span>
   <span style="color: #009900;">&#125;</span></pre></div></div>

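<p>Second, call bind(String.class).annotatedWith(Names.named(&quot;beanService1Name&quot;)).toInstance(&quot;kafka0102&quot;); to bind the value to the annotation. @Named can be applied to a parameter of any type, so the same approach works whenever an injected parameter must be one specific object. For simple types it is best to always add @Named so that different values do not collide.<br />
    4) Binding a Provider. Sometimes @Inject is not a good fit for creating an object, for example when construction depends on a complicated external environment, or when the class comes from a third-party library. There are two solutions. The first is to add a @Provides method to the module class (typically a subclass of AbstractModule):</p>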

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> BeanService1Module <span style="color: #000000; font-weight: bold;">implements</span> Module <span style="color: #009900;">&#123;</span>
		<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> configure<span style="color: #009900;">&#40;</span>Binder binder<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			binder.<span style="color: #006633;">bind</span><span style="color: #009900;">&#40;</span>Bar.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">to</span><span style="color: #009900;">&#40;</span>BarImpl1.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		@Provides
		<span style="color: #000000; font-weight: bold;">public</span> Foo provideFoo<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #666666; font-style: italic;">//create...</span>
			<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span></pre></div></div>

<p>	The second is to write a class that implements the following interface:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">interface</span> Provider<span style="color: #339933;">&lt;</span>T<span style="color: #339933;">&gt;</span> <span style="color: #009900;">&#123;</span>
  		T get<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span></pre></div></div>

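<p>	Register the class that implements this interface with a binding such as binder.bind(Foo.class).toProvider(FooProvider.class);<br />
	Of the three classic styles (constructor, field, and method injection), constructor injection is the most direct and is the recommended choice; when the constructor has to perform initialization with the injected values, the other two styles cannot do the job. Field injection smells like an anti-pattern, especially on private fields, and is not recommended. Method injection is also a good choice, and in some cases (for example, when the injected method is declared in a parent class) it fits better.</p>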
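<h3>3.3 Miscellaneous</h3>
<p>	1. When the injected type is generic (such as List&lt;String&gt;), Guice provides TypeLiteral to create the binding, for example binder.bind(new TypeLiteral&lt;List&lt;String&gt;&gt;() {}).toInstance(new ArrayList&lt;String&gt;());<br />
	2. Guice does not allow an injected parameter to be null. If null must be accepted, annotate the parameter with @Nullable.<br />
	3. By default Guice creates a new instance for every injection. Annotate a class with a scope such as @Singleton, @SessionScoped, or @RequestScoped to control the lifetime of its instances (the same can be done in the module with binder.bind(Bar.class).to(BarImpl1.class).in(Singleton.class);).<br />
	4. Guice 2.0 added AOP support, as other IoC containers have done. In my opinion AOP and IoC have little to do with each other; Spring did both, and later IoC containers have all felt obliged to offer some AOP support. From a quick read of Guice's AOP documentation, it only implements a basic interceptor pattern. If you really need AOP, Spring or AspectJ is the way to go.<br />
	5. The latest Guice release at the time of writing is 2.0. Its site, http://code.google.com/p/google-guice, has fairly thorough usage documentation and some material on the internals. The links on its wiki even include a book about Guice; will anyone buy it?</p>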
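<p>	Item 1 relies on a general Java trick: an anonymous subclass keeps its generic superclass in class metadata, so the type argument survives erasure. The sketch below shows that trick on the plain JDK. SketchTypeLiteral is a hypothetical name for this illustration; it is not Guice's TypeLiteral, which does considerably more.</p>

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical stand-in illustrating the idea behind a type literal.
abstract class SketchTypeLiteral<T> {
    final Type type;

    protected SketchTypeLiteral() {
        // For "new SketchTypeLiteral<List<String>>() {}" the generic superclass is
        // SketchTypeLiteral<List<String>>, which is recorded in the class file.
        Type superclass = getClass().getGenericSuperclass();
        if (!(superclass instanceof ParameterizedType)) {
            throw new IllegalStateException("Create an anonymous subclass with a type argument");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }
}
```

<p>	The trailing {} in new SketchTypeLiteral&lt;List&lt;String&gt;&gt;() {} is what creates the anonymous subclass; without it there is no subclass to carry the type argument.</p>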
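<h3>3.4 Practical experience</h3>
<p>	Apart from the annotations scattered across classes, the only code that touches Guice is the Module implementations and the use of the Injector. I originally assumed that bindings made in different Modules had independent scopes, but in practice the Binder passed to each Module's configure appears to be shared across the injector (the internals may differ). So one Module that binds every type is enough. That makes it easy to define a factory interface with a single method, &lt;T&gt; T newInstance(Class&lt;T&gt; c);, and implement it with a Guice-backed factory class. If you later migrate from Guice to Spring or to hand-written construction, you only need to swap in another factory implementation.</p>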
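<p>	A minimal sketch of such a factory follows. Only the newInstance signature comes from the text above; ObjectFactory, ManualObjectFactory and register are hypothetical names, and the hand-wired implementation stands in for the "manual creation" case. A Guice-backed implementation would presumably just delegate to injector.getInstance(c).</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical factory abstraction that hides the container from application code.
interface ObjectFactory {
    <T> T newInstance(Class<T> c);
}

// Hand-wired implementation: each type is created by a registered supplier.
class ManualObjectFactory implements ObjectFactory {
    private final Map<Class<?>, Supplier<?>> creators = new HashMap<>();

    <T> ManualObjectFactory register(Class<T> type, Supplier<? extends T> creator) {
        creators.put(type, creator);
        return this;
    }

    @Override
    public <T> T newInstance(Class<T> c) {
        Supplier<?> creator = creators.get(c);
        if (creator == null) {
            throw new IllegalArgumentException("No creator registered for " + c.getName());
        }
        return c.cast(creator.get());
    }
}
```

<p>	Application code depends only on ObjectFactory, so switching containers means replacing one class.</p>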
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/06/193.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Notes on “The Four Meta Secrets of Scaling at Facebook”</title>
		<link>http://www.kafka0102.com/2010/06/187.html</link>
		<comments>http://www.kafka0102.com/2010/06/187.html#comments</comments>
		<pubDate>Sun, 20 Jun 2010 16:13:56 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[Facebook]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=187</guid>
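		<description><![CDATA[Browsing the latest articles on highscalability.com, I found The Four Meta Secrets of Scaling at Facebook well worth reading. Below is a brief summary with my own commentary. Facebook's scale and engineering strength need no introduction, and its contributions to the open source community are plain to see. “The Four Meta Secrets of Scaling at Facebook” is not about the details of Facebook's architecture; it presents four lessons learned while scaling and evolving a large, complex, fast-growing site like Facebook.]]></description>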
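			<content:encoded><![CDATA[<p>Browsing the latest articles on highscalability.com, I found <a href="http://highscalability.com/blog/2010/6/10/the-four-meta-secrets-of-scaling-at-facebook.html" target="_blank">The Four Meta Secrets of Scaling at Facebook</a> well worth reading. Below is a brief summary with my own commentary. Facebook's scale and engineering strength need no introduction, and its contributions to the open source community are plain to see. “The Four Meta Secrets of Scaling at Facebook” is not about the details of Facebook's architecture; it presents four lessons learned while scaling and evolving a large, complex, fast-growing site like Facebook.</p>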
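<h2><strong>1. Scaling takes Iteration</strong></h2>
<p>Facebook also started as a small site. It grew astonishingly fast, and its technology was iterated along with the product, not without growing pains. When choosing technology, pick what fits rather than what is best. Facebook is probably the largest PHP site on the planet. PHP's simplicity makes it the choice of most websites, though sites of any real size end up struggling with its low performance. Taking the long view, PHP is still a good choice for a young site; worry about scaling it once it no longer fits the site's size.</p>
<p>The evolution of Facebook's photo system is another excellent example, and articles describing it appeared long ago. Technically the system is not dazzling, but it evidently does what Facebook needs. It went through three stages. Stage 1: photos stored on NFS with metadata in MySQL, reads and writes going straight to storage; a very simple file-storage design. Stage 2: data volume and traffic surged, so the system was optimized with a CDN in front and Memcached behind, leaving the storage layout unchanged. Stage 3: the CDN and cache worked well, but NFS is poorly suited to storing huge numbers of small files, so Facebook built its own custom-format file store, Haystack (for details see http://perspectives.mvdirona.com/2008/06/30/FacebookNeedleInAHaystackEfficientStorageOfBillionsOfPhotos.aspx ).</p>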
<h2><strong>2.  Don&#8217;t Over Design</strong></h2>
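<p>Over-design is a common failing of engineers. Out of fondness for technology, they often apply it without regard to actual requirements. When building a new system, engineers tend to estimate its post-launch scale from experience or from the reach of existing products, then pad that estimate further in the design. The result is the fastest language (say C), the fastest storage (say a home-grown file system), a very complex design and implementation, and a system with little flexibility. Too often the launch then disappoints, the product needs many changes, the technical complexity keeps the system from adapting easily, and the product ends up weighed down beyond recovery by its over-design.</p>
<p>At Facebook's scale, I suspect many companies would not run a mainstream stack like PHP+Memcached+MySQL, yet Facebook's overall architecture has stayed this simple. When PHP's performance no longer sufficed, Facebook did not switch to a faster but less productive language such as C/C++; it hacked on PHP and built HipHop. If your site runs PHP, HipHop is worth following: done well, it could be the answer to PHP's performance problems.</p>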
<h2><strong>3.  Choose the right tool for the job, but realize that your choice comes  with overhead</strong>.</h2>
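<p>Mention technology selection and I think of reinvented wheels. Home-grown wheels exist to some degree in companies large and small, and the usual justification is that we own the technology, can fix its problems ourselves, and can add features as needed. The open source world has plenty of wheels too, and a new wheel that genuinely surpasses its predecessors deserves encouragement. The worst case is when the builder starts hacking one together without understanding the strengths and weaknesses of the wheels that already exist. A large company has the people and resources to absorb that; a small one can hardly afford it. Building a wheel is easy; maintaining it may not be. Then again, expecting any wheel to fit perfectly is unrealistic. It is better to learn a wheel's strengths and weaknesses, play to the former, and avoid or fix the latter. When evaluating mature open source tools we often find them too big and heavy and want a lightweight wheel of our own; but once our own wheel is used more widely, it too grows big and heavy.</p>
<p>With technology selection, trying new things is fine, but be cautious about large-scale migrations and evaluate thoroughly first. In the Java community, SSH (Struts + Spring + Hibernate) displaced EJB and became dominant, yet that stack is clearly far more complex than JSP+Servlet+JDBC, and how much it buys you is debatable. Chasing technical sophistication blindly, without weighing its cost and the team's capacity, turns the technology itself into a burden on the product's growth. NoSQL has been all the rage these past two years, with implementations appearing endlessly; realistically, though, if MySQL used well solves the same problem, those NoSQL products are unnecessary. Of course, if the team has NoSQL expertise, it can pick what best suits each use case. Likewise PHP, Python, Ruby, and Java can all build websites and all have success stories; which language a team uses comes down to weighing everything together.</p>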
<h2><strong>4. Get the culture right</strong></h2>
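<p>Building a good development team deserves a book of its own, and it troubles the management of many teams. Three points of team culture are mentioned: Move fast (respond to change with agility, both in product features and in adapting the technology to the site's scale), Huge Impact (in an engineering team more people does not always mean more done; what matters is a small, strong team, and sometimes a single hero can carry the whole team), and Be bold (engineers should dare to fail, to try new things, and to build new things, but not blindly).</p>
<p>In my view a good team kindles its members' enthusiasm and gives them room and time to contribute. At Facebook, a person is not locked into one fixed role; everyone can touch every part of the system, propose ideas, and get the chance to improve some aspect of it. Confining engineers to their own small plots leaves little technical cross-pollination, narrows each member's skills, and lowers the team's overall effectiveness.</p>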
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/06/187.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Brief Analysis of How Netty Works</title>
		<link>http://www.kafka0102.com/2010/06/167.html</link>
		<comments>http://www.kafka0102.com/2010/06/167.html#comments</comments>
		<pubDate>Sat, 19 Jun 2010 20:21:20 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[framework]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[netty]]></category>
		<category><![CDATA[nio framework]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=167</guid>
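		<description><![CDATA[Netty is an efficient Java NIO development framework from JBoss. For how to use it, see my other article, Getting Started with Netty. This article analyzes Netty's implementation. My time being limited, I have not studied the source in exhaustive detail, so please point out and forgive any errors or imprecision below. For users, Netty ships several typical examples along with thorough API docs and a guide; some of the content and figures here come from Netty's documentation, with thanks.]]></description>
			<content:encoded><![CDATA[<p>Netty is an efficient Java NIO development framework from JBoss. For how to use it, see my other article, <a href="http://www.kafka0102.com/2010/06/netty%E4%BD%BF%E7%94%A8%E5%88%9D%E6%AD%A5/" target="_blank">Getting Started with Netty</a>. This article analyzes Netty's implementation. My time being limited, I have not studied the source in exhaustive detail, so please point out and forgive any errors or imprecision below. For users, Netty ships several typical examples along with thorough API docs and a guide; some of the content and figures here come from Netty's documentation, with thanks.</p>
<h2>1. Overall architecture</h2>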
<p style="text-align: left;"><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/architecture.png"><img class="aligncenter size-full wp-image-168" title="architecture" src="http://www.kafka0102.com/wp-content/uploads/2010/06/architecture.png" alt="" width="605" height="287" /></a></p>
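<p style="text-align: left;">First, a handsome diagram of Netty's overall architecture. The sections below analyze the core features shown in it; advanced optional features such as Container Integration and Security Support are not covered.</p>
<h2>2. Network model</h2>
<p>Netty is a typical Reactor design. For a full treatment of the Reactor pattern see POSA2; I won't explain the concept here. On building a Reactor with Java NIO, Doug Lea (the much-revered master himself) gives an excellent account in “<a href="http://gee.cs.oswego.edu/dl/cpjslides/nio.pdf" target="_blank">Scalable IO in Java</a>”. The classic figures from his slides illustrate the typical Reactor implementations:</p>
<p>1. The simplest form: a single Reactor on a single thread. The Reactor thread is a jack of all trades: it demultiplexes sockets, accepts new connections, and dispatches requests into the handler chain. This model suits cases where the business logic in the handler chain completes quickly. A single thread cannot exploit multiple cores, though, so it is not widely used in practice.</p>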
<p style="text-align: center;"><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/reactor1.png"><img class="aligncenter size-full wp-image-170" title="reactor1" src="http://www.kafka0102.com/wp-content/uploads/2010/06/reactor1.png" alt="" width="470" height="256" /></a></p>
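<p>2. Compared with the previous model, this one runs the handler chain on multiple threads (a thread pool). It is a common model for back-end services.</p>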
<p style="text-align: center;"><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/reactor2.png"><img class="aligncenter size-full wp-image-172" title="reactor2" src="http://www.kafka0102.com/wp-content/uploads/2010/06/reactor2.png" alt="" width="582" height="395" /></a></p>
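<p style="text-align: left;">3. Compared with the second, the third model splits the Reactor in two. The mainReactor listens on the server socket, accepts new connections, and hands each established socket to a subReactor. The subReactor demultiplexes the connected sockets and reads and writes network data; business processing is handed off to a worker thread pool. Typically the number of subReactors equals the number of CPUs.<br />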
<a href="http://www.kafka0102.com/wp-content/uploads/2010/06/reactor3.png"><img class="aligncenter size-full wp-image-173" title="reactor3" src="http://www.kafka0102.com/wp-content/uploads/2010/06/reactor3.png" alt="" width="573" height="389" /></a></p>
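<p>With the three forms of the Reactor model covered, which one is Netty? In fact there is one more variant I have not mentioned: the third form without the worker thread pool, and that is the default mode of Netty NIO. In the implementation, Netty's Boss class acts as the mainReactor and NioWorker as the subReactor (the default number of NioWorkers is Runtime.getRuntime().availableProcessors()). When a request arrives, the NioWorker reads the received data into a ChannelBuffer and then fires the ChannelHandler chain in the ChannelPipeline.</p>
<p>Netty is event-driven, and the ChannelHandler chain controls the flow of execution. Because the chain runs synchronously on the subReactor thread, a slow business handler severely limits the concurrency the server can sustain. This model suits applications like Memcache, but not systems that hit a database or block on other modules. Netty is highly extensible, however: to run ChannelHandlers on a thread pool, add ExecutionHandler, a ChannelHandler implementation that ships with Netty, to the ChannelPipeline; for the user it is a one-line change. Netty offers two thread pool implementations for ExecutionHandler: 1) MemoryAwareThreadPoolExecutor limits how much pending work the Executor will queue (further tasks block once the limit is reached) and can apply a separate limit per Channel; 2) OrderedMemoryAwareThreadPoolExecutor, a subclass of MemoryAwareThreadPoolExecutor, additionally guarantees that events of the same Channel are processed in order, preventing the out-of-order events that asynchronous processing can otherwise produce. It does not guarantee that all events of a Channel run on the same thread (which is usually unnecessary). In general OrderedMemoryAwareThreadPoolExecutor is a good choice, and you can always roll your own if needed.</p>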
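<p>The ordering guarantee described for OrderedMemoryAwareThreadPoolExecutor (same-Channel events in order, but not pinned to one thread) can be sketched on the plain JDK as a per-key serial queue in front of a shared pool. This is only an illustration of the idea under that assumption; PerKeyOrderedExecutor is a hypothetical name, it has no memory limits, and Netty's class is more elaborate.</p>

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Hypothetical sketch: tasks with the same key run one at a time, in submission order,
// on whichever pool thread is free; tasks with different keys run concurrently.
class PerKeyOrderedExecutor {
    private final Executor pool;
    private final Map<Object, SerialQueue> queues = new ConcurrentHashMap<>();

    PerKeyOrderedExecutor(Executor pool) {
        this.pool = pool;
    }

    void execute(Object key, Runnable task) {
        queues.computeIfAbsent(key, k -> new SerialQueue()).add(task);
    }

    private final class SerialQueue {
        private final Queue<Runnable> tasks = new ArrayDeque<>();
        private boolean running; // true while a drain step for this key is scheduled or active

        synchronized void add(Runnable task) {
            tasks.add(task);
            if (!running) {
                running = true;
                pool.execute(this::drainOne);
            }
        }

        private void drainOne() {
            Runnable next;
            synchronized (this) {
                next = tasks.poll();
            }
            try {
                if (next != null) {
                    next.run();
                }
            } finally {
                synchronized (this) {
                    if (tasks.isEmpty()) {
                        running = false;              // the next add() reschedules
                    } else {
                        pool.execute(this::drainOne); // hand the next task back to the pool
                    }
                }
            }
        }
    }
}
```

<p>Using the Channel as the key gives per-Channel ordering while still spreading the work across the pool's threads. The sketch never removes idle queues from the map, which a real implementation would have to handle.</p>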
<h2>3. buffer</h2>
<p>The interfaces and classes of the org.jboss.netty.buffer package are structured as follows:</p>
<p style="text-align: center;"><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/channelbuffer.png"><img class="aligncenter size-full wp-image-174" title="channelbuffer" src="http://www.kafka0102.com/wp-content/uploads/2010/06/channelbuffer.png" alt="" width="801" height="384" /></a></p>
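<p>The core interfaces in this package are ChannelBuffer and ChannelBufferFactory, introduced briefly below.</p>
<p>Netty uses ChannelBuffer to store and manipulate network data being read and written. Besides methods similar to ByteBuffer's, ChannelBuffer offers a number of utility methods; see its API docs for details. ChannelBuffer has several implementations; the main ones are:</p>
<p>1) HeapChannelBuffer: the ChannelBuffer Netty uses by default when reading network data. Heap here means the Java heap. Data read from a SocketChannel passes through a ByteBuffer, and a heap ByteBuffer is backed by a byte array, so ChannelBuffer also wraps a byte array, which makes conversion between ByteBuffer and ChannelBuffer zero-copy. Depending on byte order, HeapChannelBuffer comes as BigEndianHeapChannelBuffer or LittleEndianHeapChannelBuffer; BigEndianHeapChannelBuffer is the default. HeapChannelBuffer has a fixed size; to avoid allocating a poorly sized buffer, Netty sizes each new buffer with reference to how much the previous read needed.</p>
<p>2) DynamicChannelBuffer: unlike HeapChannelBuffer, DynamicChannelBuffer grows on demand. For writes in a DecodeHandler, where the amount of data is not known in advance, DynamicChannelBuffer is the usual choice.</p>
<p>3) ByteBufferBackedChannelBuffer: the direct buffer implementation; it wraps a direct ByteBuffer.</p>
<p>There are two allocation strategies for network read/write buffers: 1) For simplicity, allocate a fixed-size buffer. The drawback is that a size limit is sometimes unreasonable for an application, and a large upper bound wastes memory. 2) To address that, use a dynamic buffer; a dynamic buffer is to a fixed buffer what a List is to an array.</p>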
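<p>The List-versus-array analogy can be made concrete with a small plain-JDK sketch of a buffer that doubles its backing array when a write does not fit. GrowableBuffer and its methods are hypothetical names for this illustration; this is not the code of DynamicChannelBuffer, and the doubling policy is an assumption.</p>

```java
import java.util.Arrays;

// Hypothetical sketch of a dynamic (growable) write buffer.
class GrowableBuffer {
    private byte[] data;
    private int writerIndex;

    GrowableBuffer(int initialCapacity) {
        data = new byte[Math.max(1, initialCapacity)];
    }

    void writeBytes(byte[] src) {
        ensureWritable(src.length);
        System.arraycopy(src, 0, data, writerIndex, src.length);
        writerIndex += src.length;
    }

    private void ensureWritable(int extra) {
        int needed = writerIndex + extra;
        if (needed <= data.length) {
            return;
        }
        int newCapacity = data.length;
        while (newCapacity < needed) {
            newCapacity *= 2; // double until the pending write fits
        }
        data = Arrays.copyOf(data, newCapacity); // growth costs one array copy
    }

    int capacity() {
        return data.length;
    }

    int readableBytes() {
        return writerIndex;
    }

    byte[] toArray() {
        return Arrays.copyOf(data, writerIndex);
    }
}
```

<p>The trade-off is the same as with ArrayList: writes are cheap on average, but each growth step copies everything written so far.</p>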
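<p>There are also two common buffer retention strategies (or at least, two that I know of): 1) In a multi-threaded (thread pool) model, each thread keeps its own read/write buffers and clears them before handling a new request (or after finishing one); all reads and writes for that request must then happen on that thread. 2) Bind the buffer to the socket, independent of threads. Both aim to reuse buffers.</p>
<p>Netty's buffer handling works like this: when reading request data, Netty first reads into a newly created fixed-size HeapChannelBuffer. When that buffer is full or no more data is available, it invokes the handlers, and the first one triggered is usually the user's DecodeHandler. Handler objects are bound to the ChannelSocket, so a DecodeHandler can keep a ChannelBuffer as a member: when it finds the packet incomplete it stops processing, and on the next read event it resumes parsing from the data accumulated so far. In this process the buffer held by the DecodeHandler bound to the ChannelSocket is usually a reusable dynamic buffer (DynamicChannelBuffer), while the buffer NioWorker uses to read from the ChannelSocket is a temporary fixed-size HeapChannelBuffer, so moving data between the two involves a byte copy.</p>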
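<p>The accumulate-then-resume behaviour described above can be sketched with java.nio.ByteBuffer alone. The sketch assumes a frame format of a 4-byte big-endian length followed by the body, performs no validation of the length field, and uses hypothetical names (LengthPrefixedAccumulator, feed); it is not Netty code.</p>

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-connection decoder state: bytes left over from one read
// are kept and parsed again when the next chunk arrives.
class LengthPrefixedAccumulator {
    // Kept in write mode between calls; replaced by a larger buffer when needed.
    private ByteBuffer cumulation = ByteBuffer.allocate(16);

    /** Feeds one chunk read from the socket and returns the frames it completed. */
    List<byte[]> feed(byte[] chunk) {
        if (cumulation.remaining() < chunk.length) {
            ByteBuffer bigger = ByteBuffer.allocate(
                    Math.max(cumulation.capacity() * 2, cumulation.position() + chunk.length));
            cumulation.flip();
            bigger.put(cumulation); // copy the bytes accumulated so far
            cumulation = bigger;
        }
        cumulation.put(chunk);      // the byte copy from the temporary read buffer

        List<byte[]> frames = new ArrayList<>();
        cumulation.flip();                    // switch to read mode
        while (cumulation.remaining() >= 4) {
            cumulation.mark();
            int length = cumulation.getInt(); // 4-byte big-endian length header
            if (cumulation.remaining() < length) {
                cumulation.reset();           // body incomplete: wait for the next read
                break;
            }
            byte[] body = new byte[length];
            cumulation.get(body);
            frames.add(body);
        }
        cumulation.compact();                 // keep leftover bytes, back to write mode
        return frames;
    }
}
```

<p>Because the accumulator holds partial data between calls, one instance must belong to exactly one connection.</p>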
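<p>Internally Netty creates ChannelBuffers through the ChannelBufferFactory interface, implemented by DirectChannelBufferFactory and HeapChannelBufferFactory. Developers can create ChannelBuffers with the factory methods of the utility class ChannelBuffers.</p>
<h2>4. Channel</h2>
<p>The Channel-related interfaces and classes are structured as follows:</p>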
<p><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/Channel.png"><img class="aligncenter size-full wp-image-175" title="Channel" src="http://www.kafka0102.com/wp-content/uploads/2010/06/Channel.png" alt="" width="539" height="253" /></a></p>
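<p>As the diagram shows, a Channel mainly provides:</p>
<p>1) The current state of the Channel, such as whether it is open or closed.<br />
2) The Channel's configuration, obtained through ChannelConfig.<br />
3) The I/O operations the Channel supports, such as read, write, bind, and connect.<br />
4) The ChannelPipeline that handles the Channel, through which request-related I/O operations can be invoked.</p>
<p>On the implementation side, taking the commonly used NIO sockets as an example, Netty's NioServerSocketChannel and NioSocketChannel wrap the functionality of ServerSocketChannel and SocketChannel from java.nio, respectively.</p>
<h2>5. ChannelEvent</h2>
<p>As mentioned, Netty is event-driven and uses ChannelEvents to determine the direction of the event flow. A ChannelEvent is processed by the ChannelPipeline attached to its Channel, and the ChannelPipeline invokes ChannelHandlers to do the actual work. The interfaces and classes related to ChannelEvent are shown below:</p>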
<p><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/ChannelEvent.png"><img class="aligncenter size-full wp-image-176" title="ChannelEvent" src="http://www.kafka0102.com/wp-content/uploads/2010/06/ChannelEvent.png" alt="" width="416" height="437" /></a></p>
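<p>In a ChannelHandler implementation, users work with MessageEvent, which extends ChannelEvent, and call its getMessage() method to obtain the ChannelBuffer that was read or the object it was decoded into.</p>
<h2>6. ChannelPipeline</h2>
<p>Netty handles events by routing them through a ChannelPipeline, which invokes the series of ChannelHandlers registered on it; this is the classic interceptor pattern. The interfaces and classes related to ChannelPipeline are shown below:</p>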
<p><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/ChannelPipeline.png"><img class="aligncenter size-full wp-image-177" title="ChannelPipeline" src="http://www.kafka0102.com/wp-content/uploads/2010/06/ChannelPipeline.png" alt="" width="559" height="237" /></a></p>
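<p>There are two kinds of event flow: upstream events and downstream events. A ChannelHandler registered on a ChannelPipeline may be a ChannelUpstreamHandler or a ChannelDownstreamHandler, but as an event travels through the pipeline only the handlers matching its direction are invoked. Within the chain, a ChannelUpstreamHandler or ChannelDownstreamHandler can either stop the flow or pass the event on by calling ChannelHandlerContext.sendUpstream(ChannelEvent) or ChannelHandlerContext.sendDownstream(ChannelEvent). The figure below illustrates the event flow:</p>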
<p><a href="http://www.kafka0102.com/wp-content/uploads/2010/06/ChannelPipeline.jpg"><img class="aligncenter size-full wp-image-178" title="ChannelPipeline" src="http://www.kafka0102.com/wp-content/uploads/2010/06/ChannelPipeline.jpg" alt="" width="522" height="622" /></a></p>
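<p>As the figure shows, upstream events are handled by the upstream handlers one by one from bottom to top, and downstream events by the downstream handlers from top to bottom, where bottom and top correspond to the order in which handlers were added to the ChannelPipeline (the first one added sits at the bottom). Put simply, upstream events carry incoming data from the outside, while downstream events carry data being sent out.</p>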
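<p>The traversal order can be pinned down with a tiny plain-JDK model: upstream events visit matching handlers from first added to last, downstream events from last added to first, and handlers of the other direction are skipped. SketchPipeline and its members are hypothetical names; the model captures ordering only and none of Netty's actual event passing.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of handler ordering in a pipeline; these are not Netty's types.
class SketchPipeline {
    enum Direction { UPSTREAM, DOWNSTREAM }

    static final class Entry {
        final String name;
        final Direction direction;

        Entry(String name, Direction direction) {
            this.name = name;
            this.direction = direction;
        }
    }

    private final List<Entry> handlers = new ArrayList<>();

    SketchPipeline addLast(String name, Direction direction) {
        handlers.add(new Entry(name, direction));
        return this;
    }

    /** Upstream (inbound) events visit matching handlers from first added to last. */
    List<String> fireUpstream() {
        List<String> visited = new ArrayList<>();
        for (Entry e : handlers) {
            if (e.direction == Direction.UPSTREAM) {
                visited.add(e.name);
            }
        }
        return visited;
    }

    /** Downstream (outbound) events visit matching handlers from last added to first. */
    List<String> fireDownstream() {
        List<String> visited = new ArrayList<>();
        for (int i = handlers.size() - 1; i >= 0; i--) {
            Entry e = handlers.get(i);
            if (e.direction == Direction.DOWNSTREAM) {
                visited.add(e.name);
            }
        }
        return visited;
    }
}
```

<p>This is why an encoder can be registered before the business handler: a response written by the business handler travels back toward the head of the pipeline and passes through the encoder on the way out.</p>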
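<p>Server-side request handling usually consists of decoding the request, running the business logic, and encoding the response, so the ChannelPipeline is built along these lines:</p>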

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">ChannelPipeline pipeline <span style="color: #339933;">=</span> Channels.<span style="color: #006633;">pipeline</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;decoder&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MyProtocolDecoder<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;encoder&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MyProtocolEncoder<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;handler&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MyBusinessLogicHandler<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

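<p>Here MyProtocolDecoder is a ChannelUpstreamHandler and MyProtocolEncoder is a ChannelDownstreamHandler. MyBusinessLogicHandler may be a ChannelUpstreamHandler and may also implement ChannelDownstreamHandler, depending on whether the program is a server or a client and on the application's needs.</p>
<p>One more point: Netty decouples abstraction from implementation very well. The org.jboss.netty.channel.socket package defines the interfaces for socket handling, while packages such as org.jboss.netty.channel.socket.nio and org.jboss.netty.channel.socket.oio contain the implementations for each I/O model.</p>
<h2>7. codec framework</h2>
<p>You can of course encode and decode a protocol by manipulating the bytes in a ChannelBuffer yourself according to the protocol format. Netty also provides several handy codec helpers, briefly introduced here.</p>
<p>1) FrameDecoder: internally keeps a DynamicChannelBuffer member holding the data received so far. It works as a template: the whole decoding procedure is already written, and subclasses only implement decode. FrameDecoder has two direct implementations: (1) DelimiterBasedFrameDecoder decodes frames separated by a delimiter (such as \r\n), which is specified in the constructor. (2) LengthFieldBasedFrameDecoder decodes frames based on a length field. If the protocol looks like “content length” + content, or “fixed header” + “content length” + variable content, this decoder applies; its usage is explained in detail in the API docs.<br />
2) ReplayingDecoder: a specialized subclass of FrameDecoder that lets you write a non-blocking decoder as if the I/O were blocking. With FrameDecoder you must allow for the data read so far being incomplete, whereas with ReplayingDecoder you can code as though all the data has already arrived.<br />
3) ObjectEncoder and ObjectDecoder: encode and decode serialized Java objects.<br />
4) HttpRequestEncoder and HttpRequestDecoder: HTTP protocol handling.</p>
<p>Here are two examples, one using FrameDecoder and one using ReplayingDecoder:</p>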

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> IntegerHeaderFrameDecoder <span style="color: #000000; font-weight: bold;">extends</span> FrameDecoder <span style="color: #009900;">&#123;</span>
		<span style="color: #000000; font-weight: bold;">protected</span> <span style="color: #003399;">Object</span> decode<span style="color: #009900;">&#40;</span>ChannelHandlerContext ctx, Channel channel,
				ChannelBuffer buf<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>buf.<span style="color: #006633;">readableBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			buf.<span style="color: #006633;">markReaderIndex</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #000066; font-weight: bold;">int</span> length <span style="color: #339933;">=</span> buf.<span style="color: #006633;">readInt</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
			<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>buf.<span style="color: #006633;">readableBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;</span> length<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
				buf.<span style="color: #006633;">resetReaderIndex</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
				<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span>
			<span style="color: #009900;">&#125;</span>
			<span style="color: #000000; font-weight: bold;">return</span> buf.<span style="color: #006633;">readBytes</span><span style="color: #009900;">&#40;</span>length<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span></pre></div></div>

<p>A decoder built on ReplayingDecoder looks like the following and is considerably simpler.</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> IntegerHeaderFrameDecoder2 <span style="color: #000000; font-weight: bold;">extends</span> ReplayingDecoder<span style="color: #339933;">&lt;</span>VoidEnum<span style="color: #339933;">&gt;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000000; font-weight: bold;">protected</span> <span style="color: #003399;">Object</span> decode<span style="color: #009900;">&#40;</span>ChannelHandlerContext ctx, Channel channel,
				ChannelBuffer buf, VoidEnum state<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">return</span> buf.<span style="color: #006633;">readBytes</span><span style="color: #009900;">&#40;</span>buf.<span style="color: #006633;">readInt</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
	<span style="color: #009900;">&#125;</span></pre></div></div>

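<p>Implementation-wise, when the decode method of a ReplayingDecoder subclass reads from the ChannelBuffer and there is not enough data, the buffer throws an Error. ReplayingDecoder catches it, takes back control, and calls decode again once more data has been read.</p>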
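<p>The replay mechanism can be sketched on the plain JDK: reads fail with a signal exception instead of returning partial data, and the caller rewinds the buffer so the same decode logic runs again from the start once more bytes arrive. ReplaySketch, NeedMoreData, ReplayingView and tryDecode are hypothetical names; this illustrates the idea only and is not Netty's implementation (which, as noted above, signals with an Error).</p>

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

// Hypothetical sketch of the replay idea.
class ReplaySketch {
    /** Signal thrown when a read needs more bytes than are buffered. */
    static final class NeedMoreData extends RuntimeException {
        NeedMoreData() {
            super(null, null, false, false); // used as control flow, so skip the stack trace
        }
    }

    /** View whose reads fail with NeedMoreData instead of returning partial data. */
    static final class ReplayingView {
        private final ByteBuffer buf;

        ReplayingView(ByteBuffer buf) {
            this.buf = buf;
        }

        int readInt() {
            if (buf.remaining() < 4) {
                throw new NeedMoreData();
            }
            return buf.getInt();
        }

        byte[] readBytes(int length) {
            if (buf.remaining() < length) {
                throw new NeedMoreData();
            }
            byte[] out = new byte[length];
            buf.get(out);
            return out;
        }
    }

    /** Runs the decoder; on NeedMoreData rewinds the buffer and returns null. */
    static <T> T tryDecode(ByteBuffer readable, Function<ReplayingView, T> decoder) {
        int checkpoint = readable.position();
        try {
            return decoder.apply(new ReplayingView(readable));
        } catch (NeedMoreData e) {
            readable.position(checkpoint); // rewind; the caller retries after the next read
            return null;
        }
    }
}
```

<p>With this helper the decode logic shrinks to v -&gt; v.readBytes(v.readInt()), mirroring the one-line decode method above. The price is that everything before the failing read is executed again on each retry.</p>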
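<h2>8. Summary</h2>
<p>This article ends here, though it clearly has not explained Netty's internals fully or in depth. I was reading the Netty code and noting things worth writing about as I went, in fits and starts, and by the end I had lost much of my enthusiasm. I still enjoy source code analysis, but energy is finite, and if the results cannot be laid out coherently and yield meaningful insight, the analysis has little value or fun. As for impressions: Netty's code is beautiful, with clear structure and layering. But so much interface-oriented abstraction makes the code hard to trace; you keep running into interfaces and abstract classes and must lean on the factory classes and API docs, repeatedly matching interfaces to implementations. Like almost every good Java open source project, Netty uses a range of fine design patterns, and one could write a whole article from that angle alone, though I have no such plan for now. Having finished this article, I have little interest in reading more Netty code.</p>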
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/06/167.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Getting Started with Netty</title>
		<link>http://www.kafka0102.com/2010/06/161.html</link>
		<comments>http://www.kafka0102.com/2010/06/161.html#comments</comments>
		<pubDate>Sat, 19 Jun 2010 16:32:15 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[framework]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[netty]]></category>
		<category><![CDATA[nio framework]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=161</guid>
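		<description><![CDATA[    Java 1.4 introduced NIO, which lets developers write high-performance servers in Java. But using the raw NIO API is much like network programming in C on Linux: you still have to do the low-level work of I/O handling and protocol handling. So, just as C servers widely use libevent as their networking framework, the Java community keeps producing NIO-based network application frameworks, and Netty from JBoss stands out among them. Netty is an asynchronous, event-driven network application framework with high performance and high extensibility. It offers a unified API over the underlying transports, freeing developers from the details of the network protocols below (such as TCP/IP and UDP). In practice, the examples and guide that ship with Netty are all a developer needs to start building Netty-based servers.
    In the Java community the best-known open source NIO frameworks are Mina and Netty. The two share a lot of history and are naturally often compared. Netty's author was in fact one of Mina's authors, so the two can be expected to have much in common in design philosophy. I have not studied Mina, but according to the author, Netty's design is more extensible and developer-friendly and it outperforms Mina; Netty's thorough documentation is also a draw. So if you are looking for a Java NIO framework, Netty is a very good choice. This article walks through using Netty by way of a demo.]]></description>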
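			<content:encoded><![CDATA[<h2>1. Introduction</h2>
<p>    Java 1.4 introduced NIO, which lets developers write high-performance servers in Java. But using the raw NIO API is much like network programming in C on Linux: you still have to do the low-level work of I/O handling and protocol handling. So, just as C servers widely use libevent as their networking framework, the Java community keeps producing NIO-based network application frameworks, and Netty from JBoss stands out among them. Netty is an asynchronous, event-driven network application framework with high performance and high extensibility. It offers a unified API over the underlying transports, freeing developers from the details of the network protocols below (such as TCP/IP and UDP). In practice, the examples and guide that ship with Netty are all a developer needs to start building Netty-based servers.</p>
<p>    In the Java community the best-known open source NIO frameworks are Mina and Netty. The two share a lot of history and are naturally often compared. Netty's author was in fact one of Mina's authors, so the two can be expected to have much in common in design philosophy. I have not studied Mina, but according to the author, Netty's design is more extensible and developer-friendly and it outperforms Mina; Netty's thorough documentation is also a draw. So if you are looking for a Java NIO framework, Netty is a very good choice. This article walks through using Netty by way of a demo.</p>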
<h2>2、服务端程序</h2>
<h3>2.1、ChannelHandler</h3>
<p>     服务端程序通常的处理过程是：解码请求数据、业务逻辑处理、编码响应。从框架角度来说，可以提供3个接口来控制并调度该处理过程；从更通用的角度来说，并不特化处理其中的每一步，而把每一步当做过滤器链中的一环，这也是Netty的做法。Netty对请求处理过程实现了过滤器链模式（ChannelPipeline），每个过滤器实现了ChannelHandler接口。Netty中有两种请求事件流类型也做了细分：</p>
<p>    1）downstream event：其对应的ChannelHandler子接口是ChannelDownstreamHandler。downstream event是说从头到尾执行ChannelPipeline中的ChannelDownstreamHandler，这一过程相当于向外发送数据的过程。 downstream event有：&#8221;write&#8221;、&#8221;bind&#8221;、&#8221;unbind&#8221;、 &#8220;connect&#8221;、 &#8220;disconnect&#8221;、&#8221;close&#8221;。</p>
<p>    2）upstream event：其对应的ChannelHandler子接口是ChannelUpstreamHandler。upstream event处理的事件方向和downstream event相反，这一过程相当于接收处理外来请求的过程。upstream event有：&#8221;messageReceived&#8221;、 &#8220;exceptionCaught&#8221;、&#8221;channelOpen&#8221;、&#8221;channelClosed&#8221;、 &#8220;channelBound&#8221;、&#8221;channelUnbound&#8221;、 &#8220;channelConnected&#8221;、&#8221;writeComplete&#8221;、&#8221;channelDisconnected&#8221;、&#8221;channelInterestChanged&#8221;。</p>
<p>     Netty中有个注释@interface ChannelPipelineCoverage，它表示被注释的ChannelHandler是否能添加到多个ChannelPipeline中，其可选的值是&#8221;all&#8221;和&#8221;one&#8221;。&#8221;all&#8221;表示ChannelHandler是无状态的，可被多个ChannelPipeline共享，而&#8221;one&#8221;表示ChannelHandler只作用于单个ChannelPipeline中。但ChannelPipelineCoverage只是个注释而已，并没有实际的检查作用。对于ChannelHandler是&#8221;all&#8221;还是&#8221;one&#8221;，还是根据逻辑需要而定。比如，像解码请求handler，因为可能解码的数据不完整，需要等待下一次读事件来了之后再继续解析，所以解码请求handler就需要是&#8221;one&#8221;的（否则多个Channel共享数据就乱了）。而像业务逻辑处理hanlder通常是&#8221;all&#8221;的。</p>
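<p>     The following simple example shows how to write the ChannelHandlers involved in “decode the request data, run the business logic, encode the response”. The protocol format is simple: in both the request and the response stream, the first 4 bytes give the length of the content that follows, and the content body is read according to that length.</p>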
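<p>    To make the byte layout concrete, here is a minimal sketch that uses only java.nio.ByteBuffer rather than Netty. The class name FrameFormatSketch and the choice of UTF-8 are assumptions for illustration (the handlers in this article use the platform default charset). The length is written big-endian, which is ByteBuffer's default byte order.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.Arrays;

/** Illustrative helper for the 4-byte length-prefixed format; not part of Netty. */
final class FrameFormatSketch {

    // Assumption: UTF-8 is used so both peers agree on the encoding.
    private static final Charset UTF_8 = Charset.forName("UTF-8");

    /** Builds one frame: a 4-byte big-endian body length followed by the body bytes. */
    static byte[] frame(String body) {
        byte[] data = body.getBytes(UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + data.length);
        buf.putInt(data.length); // length header
        buf.put(data);           // content body
        return buf.array();
    }

    public static void main(String[] args) {
        // prints [0, 0, 0, 5, 104, 101, 108, 108, 111]
        System.out.println(Arrays.toString(frame("hello")));
    }
}
```

<p>    A five-character ASCII body therefore occupies nine bytes on the wire: four for the length and five for the content.</p>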
<p>    First, the decoder implementation:</p>

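<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.buffer.ChannelBuffer<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.Channel<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelHandlerContext<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.handler.codec.frame.FrameDecoder<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageDecoder <span style="color: #000000; font-weight: bold;">extends</span> FrameDecoder <span style="color: #009900;">&#123;</span>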
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">protected</span> <span style="color: #003399;">Object</span> decode<span style="color: #009900;">&#40;</span>
            ChannelHandlerContext ctx, Channel channel, ChannelBuffer buffer<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>buffer.<span style="color: #006633;">readableBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;</span> <span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(1)</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #000066; font-weight: bold;">int</span> dataLength <span style="color: #339933;">=</span> buffer.<span style="color: #006633;">getInt</span><span style="color: #009900;">&#40;</span>buffer.<span style="color: #006633;">readerIndex</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>buffer.<span style="color: #006633;">readableBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">&lt;</span> dataLength <span style="color: #339933;">+</span> <span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000066; font-weight: bold;">null</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(2)</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        buffer.<span style="color: #006633;">skipBytes</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(3)</span>
        <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> decoded <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span>dataLength<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
        buffer.<span style="color: #006633;">readBytes</span><span style="color: #009900;">&#40;</span>decoded<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #003399;">String</span> msg <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">String</span><span style="color: #009900;">&#40;</span>decoded<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(4)</span>
        <span style="color: #000000; font-weight: bold;">return</span> msg<span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

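<p>    MessageDecoder extends FrameDecoder, a helper class in Netty's codec package. FrameDecoder is a ChannelUpstreamHandler, and decode is the method its subclasses must implement. In the code above:</p>
<p>    (1) Check the number of bytes in the ChannelBuffer; if fewer than 4 bytes are readable, return null and wait for the next read event.<br />
    (2) Check again: if fewer than dataLength + 4 bytes are readable, return null and wait for the next read event.<br />
    (3) Skip the 4 bytes that hold dataLength.<br />
    (4) Build the decoded string and return it. Note that new String(decoded) uses the platform default charset.</p>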
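<p>    The four steps above can be tried out without Netty. The sketch below is a plain-Java stand-in for the buffering that FrameDecoder does on the decoder's behalf: it keeps the bytes that have arrived so far and emits a message only once a whole frame is present. LengthPrefixedDecoderSketch, feed and the UTF-8 charset are assumptions for illustration, not part of Netty's API.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

/** Per-connection decoder state; a plain-Java stand-in, not Netty's FrameDecoder. */
final class LengthPrefixedDecoderSketch {

    // Assumption: bodies are UTF-8 text.
    private static final Charset UTF_8 = Charset.forName("UTF-8");

    // Bytes received so far that do not yet form a complete frame.
    private byte[] pending = new byte[0];

    /** Appends a chunk as it arrives and returns every message that is now complete. */
    List<String> feed(byte[] chunk) {
        byte[] all = new byte[pending.length + chunk.length];
        System.arraycopy(pending, 0, all, 0, pending.length);
        System.arraycopy(chunk, 0, all, pending.length, chunk.length);

        ByteBuffer buf = ByteBuffer.wrap(all);
        List<String> messages = new ArrayList<String>();
        while (true) {
            if (buf.remaining() < 4) {
                break; // (1) length header incomplete, wait for more data
            }
            int dataLength = buf.getInt(buf.position()); // peek without consuming
            if (dataLength < 0) {
                throw new IllegalArgumentException("negative length: " + dataLength);
            }
            if (buf.remaining() - 4 < dataLength) {
                break; // (2) body incomplete, wait for more data
            }
            buf.position(buf.position() + 4); // (3) skip the length header
            byte[] body = new byte[dataLength];
            buf.get(body);
            messages.add(new String(body, UTF_8)); // (4) build the decoded string
        }

        // Keep whatever is left for the next call.
        pending = new byte[buf.remaining()];
        buf.get(pending);
        return messages;
    }

    public static void main(String[] args) {
        LengthPrefixedDecoderSketch decoder = new LengthPrefixedDecoderSketch();
        System.out.println(decoder.feed(new byte[] {0, 0, 0, 2, 'h'})); // prints []
        System.out.println(decoder.feed(new byte[] {'i'}));             // prints [hi]
    }
}
```

<p>    Because pending carries leftover bytes from one call to the next, each connection needs its own instance. That is the same reason a decoding handler has to be &#8220;one&#8221; rather than &#8220;all&#8221;.</p>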

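<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> java.util.logging.Level<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> java.util.logging.Logger<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelHandlerContext<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelPipelineCoverage<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ExceptionEvent<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.MessageEvent<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.SimpleChannelUpstreamHandler<span style="color: #339933;">;</span>
&nbsp;
@ChannelPipelineCoverage<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;all&quot;</span><span style="color: #009900;">&#41;</span>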
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageServerHandler <span style="color: #000000; font-weight: bold;">extends</span> SimpleChannelUpstreamHandler <span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> Logger logger <span style="color: #339933;">=</span> Logger.<span style="color: #006633;">getLogger</span><span style="color: #009900;">&#40;</span>
            MessageServerHandler.<span style="color: #000000; font-weight: bold;">class</span>.<span style="color: #006633;">getName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> messageReceived<span style="color: #009900;">&#40;</span>
            ChannelHandlerContext ctx, MessageEvent e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span><span style="color: #009900;">&#40;</span>e.<span style="color: #006633;">getMessage</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">instanceof</span> <span style="color: #003399;">String</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">return</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(1)</span>
        <span style="color: #009900;">&#125;</span>
        <span style="color: #003399;">String</span> msg <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span><span style="color: #009900;">&#41;</span> e.<span style="color: #006633;">getMessage</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #003399;">System</span>.<span style="color: #006633;">err</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;got msg:&quot;</span><span style="color: #339933;">+</span>msg<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        e.<span style="color: #006633;">getChannel</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">write</span><span style="color: #009900;">&#40;</span>msg<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(2)</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> exceptionCaught<span style="color: #009900;">&#40;</span>
            ChannelHandlerContext ctx, ExceptionEvent e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        logger.<span style="color: #006633;">log</span><span style="color: #009900;">&#40;</span>
                Level.<span style="color: #006633;">WARNING</span>,
                <span style="color: #0000ff;">&quot;Unexpected exception from downstream.&quot;</span>,
                e.<span style="color: #006633;">getCause</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        e.<span style="color: #006633;">getChannel</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">close</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

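<p>    MessageServerHandler is the server-side business handler. It extends SimpleChannelUpstreamHandler and mainly implements the messageReceived event. Notes on this class:</p>
<p>    (1) In this upstream flow, the event first passes through MessageDecoder, and the decoded value returned by decode becomes the result of MessageEvent.getMessage(). So, depending on where a handler sits in the pipeline, MessageEvent.getMessage() does not necessarily return a ChannelBuffer.<br />
    (2) MessageServerHandler simply writes the received msg back to the client. The call e.getChannel().write(msg); triggers a DownstreamMessageEvent, that is, it invokes the MessageEncoder below, which encodes the data before it is returned to the client.</p>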

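<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.buffer.ChannelBuffer<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.buffer.ChannelBuffers<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.Channel<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelHandlerContext<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelPipelineCoverage<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.handler.codec.oneone.OneToOneEncoder<span style="color: #339933;">;</span>
&nbsp;
@ChannelPipelineCoverage<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;all&quot;</span><span style="color: #009900;">&#41;</span>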
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageEncoder <span style="color: #000000; font-weight: bold;">extends</span> OneToOneEncoder <span style="color: #009900;">&#123;</span>
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">protected</span> <span style="color: #003399;">Object</span> encode<span style="color: #009900;">&#40;</span>
            ChannelHandlerContext ctx, Channel channel, <span style="color: #003399;">Object</span> msg<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span><span style="color: #009900;">&#40;</span>msg <span style="color: #000000; font-weight: bold;">instanceof</span> <span style="color: #003399;">String</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #000000; font-weight: bold;">return</span> msg<span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(1)</span>
        <span style="color: #009900;">&#125;</span>
&nbsp;
        <span style="color: #003399;">String</span> res <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span><span style="color: #009900;">&#41;</span>msg<span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> data <span style="color: #339933;">=</span> res.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> dataLength <span style="color: #339933;">=</span> data.<span style="color: #006633;">length</span><span style="color: #339933;">;</span>
        ChannelBuffer buf <span style="color: #339933;">=</span> ChannelBuffers.<span style="color: #006633;">dynamicBuffer</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(2)</span>
        buf.<span style="color: #006633;">writeInt</span><span style="color: #009900;">&#40;</span>dataLength<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        buf.<span style="color: #006633;">writeBytes</span><span style="color: #009900;">&#40;</span>data<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">return</span> buf<span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">//(3)</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

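<p>    MessageEncoder is a ChannelDownstreamHandler. Notes on this class:</p>
<p>    (1) If msg is not of a supported type, return it unchanged; OneToOneEncoder then calls ctx.sendDownstream(evt); to pass it to the next ChannelDownstreamHandler. In this example that case should never occur.<br />
    (2) This is where developers create a ChannelBuffer themselves. dynamicBuffer is usually sufficient; it returns a ChannelBuffer whose capacity grows on demand.<br />
    (3) After the encoded ChannelBuffer is returned, OneToOneEncoder calls Channels.write to write the data back to the client.</p>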
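<h3>2.2. MessageServerPipelineFactory</h3>
<p>    With the three ChannelHandlers written, they need to be registered in a ChannelPipeline, and a ChannelPipeline in turn corresponds to a Channel (whether it is a global singleton or one instance per Channel depends on the implementation). Implementing ChannelPipelineFactory, the factory interface for ChannelPipeline, achieves this. The code of MessageServerPipelineFactory is as follows:</p>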

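<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import static</span> org.jboss.netty.channel.Channels.pipeline<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelPipeline<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelPipelineFactory<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageServerPipelineFactory <span style="color: #000000; font-weight: bold;">implements</span>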
        ChannelPipelineFactory <span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> ChannelPipeline getPipeline<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
        ChannelPipeline pipeline <span style="color: #339933;">=</span> pipeline<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;decoder&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MessageDecoder<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;encoder&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MessageEncoder<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;handler&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MessageServerHandler<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #000000; font-weight: bold;">return</span> pipeline<span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<h3>2.3. MessageServer</h3>
<p>    All that remains on the server side is the startup code, which Netty's ServerBootstrap makes quick work of.</p>

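<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> java.net.InetSocketAddress<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> java.util.concurrent.Executors<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.bootstrap.ServerBootstrap<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageServer <span style="color: #009900;">&#123;</span>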
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> args<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// Configure the server.</span>
        ServerBootstrap bootstrap <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ServerBootstrap<span style="color: #009900;">&#40;</span>
                <span style="color: #000000; font-weight: bold;">new</span> NioServerSocketChannelFactory<span style="color: #009900;">&#40;</span>
                        Executors.<span style="color: #006633;">newCachedThreadPool</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
                        Executors.<span style="color: #006633;">newCachedThreadPool</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// Set up the default event pipeline.</span>
        bootstrap.<span style="color: #006633;">setPipelineFactory</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> MessageServerPipelineFactory<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">// Bind and start to accept incoming connections.</span>
        bootstrap.<span style="color: #006633;">bind</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> InetSocketAddress<span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">8080</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

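<p>    One caveat: this server program is incomplete. It does not release its resources on shutdown, although, bluntly, that kind of cleanup is not strictly necessary.</p>
<h2>3. Client Program</h2>
<p>    The client follows much the same processing model as the server; the code is included here with brief notes.</p>
<h3>3.1. ChannelHandler</h3>
<p>    The client first sends data to the server (a downstream event flow) and then handles the data it receives from the server (an upstream event flow). The question is how to get the data to be sent into the downstream flow in the first place. This is where ChannelUpstreamHandler's channelConnected event comes in. The MessageClientHandler implementation is as follows:</p>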

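<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> java.util.logging.Level<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> java.util.logging.Logger<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelHandlerContext<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelPipelineCoverage<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelStateEvent<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ExceptionEvent<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.MessageEvent<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.SimpleChannelUpstreamHandler<span style="color: #339933;">;</span>
&nbsp;
@ChannelPipelineCoverage<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;all&quot;</span><span style="color: #009900;">&#41;</span>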
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageClientHandler <span style="color: #000000; font-weight: bold;">extends</span> SimpleChannelUpstreamHandler <span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> Logger logger <span style="color: #339933;">=</span> Logger.<span style="color: #006633;">getLogger</span><span style="color: #009900;">&#40;</span>
            MessageClientHandler.<span style="color: #000000; font-weight: bold;">class</span>.<span style="color: #006633;">getName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> channelConnected<span style="color: #009900;">&#40;</span>
            ChannelHandlerContext ctx, ChannelStateEvent e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #003399;">String</span> message <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;hello kafka0102&quot;</span><span style="color: #339933;">;</span>
        e.<span style="color: #006633;">getChannel</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">write</span><span style="color: #009900;">&#40;</span>message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> messageReceived<span style="color: #009900;">&#40;</span>
            ChannelHandlerContext ctx, MessageEvent e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// Send back the received message to the remote peer.</span>
        <span style="color: #003399;">System</span>.<span style="color: #006633;">err</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;messageReceived send message &quot;</span><span style="color: #339933;">+</span>e.<span style="color: #006633;">getMessage</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399;">Thread</span>.<span style="color: #006633;">sleep</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">1000</span><span style="color: #339933;">*</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">Exception</span> ex<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            ex.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
        e.<span style="color: #006633;">getChannel</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">write</span><span style="color: #009900;">&#40;</span>e.<span style="color: #006633;">getMessage</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    @Override
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> exceptionCaught<span style="color: #009900;">&#40;</span>
            ChannelHandlerContext ctx, ExceptionEvent e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// Close the connection when an exception is raised.</span>
        logger.<span style="color: #006633;">log</span><span style="color: #009900;">&#40;</span>
                Level.<span style="color: #006633;">WARNING</span>,
                <span style="color: #0000ff;">&quot;Unexpected exception from downstream.&quot;</span>,
                e.<span style="color: #006633;">getCause</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        e.<span style="color: #006633;">getChannel</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">close</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>    For the encoding and decoding handlers, simply reuse MessageEncoder and MessageDecoder.</p>
<h3>3.2. MessageClientPipelineFactory</h3>

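<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import static</span> org.jboss.netty.channel.Channels.pipeline<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelPipeline<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelPipelineFactory<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageClientPipelineFactory <span style="color: #000000; font-weight: bold;">implements</span>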
        ChannelPipelineFactory <span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> ChannelPipeline getPipeline<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
        ChannelPipeline pipeline <span style="color: #339933;">=</span> pipeline<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;decoder&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MessageDecoder<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;encoder&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MessageEncoder<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        pipeline.<span style="color: #006633;">addLast</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;handler&quot;</span>, <span style="color: #000000; font-weight: bold;">new</span> MessageClientHandler<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #000000; font-weight: bold;">return</span> pipeline<span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<h3>3.3. MessageClient</h3>

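<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> java.net.InetSocketAddress<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> java.util.concurrent.Executors<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.bootstrap.ClientBootstrap<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.ChannelFuture<span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory<span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> MessageClient <span style="color: #009900;">&#123;</span>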
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> args<span style="color: #009900;">&#41;</span> <span style="color: #000000; font-weight: bold;">throws</span> <span style="color: #003399;">Exception</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #666666; font-style: italic;">// Parse options.</span>
        <span style="color: #003399;">String</span> host <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;127.0.0.1&quot;</span><span style="color: #339933;">;</span>
        <span style="color: #000066; font-weight: bold;">int</span> port <span style="color: #339933;">=</span> <span style="color: #cc66cc;">8080</span><span style="color: #339933;">;</span>
        <span style="color: #666666; font-style: italic;">// Configure the client.</span>
        ClientBootstrap bootstrap <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ClientBootstrap<span style="color: #009900;">&#40;</span>
                <span style="color: #000000; font-weight: bold;">new</span> NioClientSocketChannelFactory<span style="color: #009900;">&#40;</span>
                        Executors.<span style="color: #006633;">newCachedThreadPool</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>,
                        Executors.<span style="color: #006633;">newCachedThreadPool</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #666666; font-style: italic;">// Set up the event pipeline factory.</span>
        bootstrap.<span style="color: #006633;">setPipelineFactory</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> MessageClientPipelineFactory<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #666666; font-style: italic;">// Start the connection attempt.</span>
        ChannelFuture future <span style="color: #339933;">=</span> bootstrap.<span style="color: #006633;">connect</span><span style="color: #009900;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> InetSocketAddress<span style="color: #009900;">&#40;</span>host, port<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #666666; font-style: italic;">// Wait until the connection is closed or the connection attempt fails.</span>
        future.<span style="color: #006633;">getChannel</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">getCloseFuture</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">awaitUninterruptibly</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #666666; font-style: italic;">// Shut down thread pools to exit.</span>
        bootstrap.<span style="color: #006633;">releaseExternalResources</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

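<p>When I wrote this client example, the code I had in mind looked different. I have not studied the client side in much depth, so there may well be a better solution that I have not found. In the example above, bootstrap.connect triggers the actual connection attempt, which in turn triggers MessageClientHandler.channelConnected and sets the whole process in motion. What I really want, though, is a connection pool, and the data should not be written from channelConnected: with that design, dynamic data can only be handed to the handler through its constructor. So far I still do not see how to combine a connection pool with ChannelPipeline effectively. Perhaps this requirement is better implemented without Netty.</p>
<h2>4. Summary</h2>
<p>That is all for this first look at using Netty. I wrote this article in fits and starts, to the point that I eventually lost interest in writing up everything I had. That is partly because I had first been organizing material on how Netty works internally. I can only humbly hope that this article is of some help to those getting started with Netty.</p>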
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/06/161.html/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Using Scripting Languages in Java</title>
		<link>http://www.kafka0102.com/2010/06/155.html</link>
		<comments>http://www.kafka0102.com/2010/06/155.html#comments</comments>
		<pubDate>Sun, 06 Jun 2010 07:45:57 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[ScriptEngine]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=155</guid>
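		<description><![CDATA[I had not worked with Java for a while and knew little about the features added since Java 5. While reading Solr's DIH over the past few days, I came across a very nice configuration feature: script engines. Java used to be described as "compile once, run anywhere"; now it can be described as one platform with many languages. I took the opportunity to summarize the characteristics and capabilities of the script engine support introduced in Java 6.]]></description>
			<content:encoded><![CDATA[<p>I had not worked with Java for a while and knew little about the features added since Java 5. While reading Solr's DIH over the past few days, I came across a very nice configuration feature: script engines. Java used to be described as "compile once, run anywhere"; now it can be described as one platform with many languages. I took the opportunity to summarize the characteristics and capabilities of the script engine support introduced in Java 6.</p>
<h2>1. Available script engines</h2>
<p>Java 6 supports executing scripting languages. The support comes from the JSR 223 specification, and the corresponding package is javax.script. By default Java 6 supports only JavaScript, implemented on top of Mozilla Rhino (Rhino means rhinoceros; does it remind you of the cover of that hefty JavaScript book?), a JavaScript implementation written purely in Java. The following code lists the script engines available in the current environment:</p>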

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">        ScriptEngineManager manager <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ScriptEngineManager<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        List<span style="color: #339933;">&lt;</span>ScriptEngineFactory<span style="color: #339933;">&gt;</span> factories <span style="color: #339933;">=</span> manager.<span style="color: #006633;">getEngineFactories</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span>ScriptEngineFactory f <span style="color: #339933;">:</span> factories<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>
                    <span style="color: #0000ff;">&quot;engine name:&quot;</span><span style="color: #339933;">+</span>f.<span style="color: #006633;">getEngineName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                    <span style="color: #0000ff;">&quot;,engine version:&quot;</span><span style="color: #339933;">+</span>f.<span style="color: #006633;">getEngineVersion</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                    <span style="color: #0000ff;">&quot;,language name:&quot;</span><span style="color: #339933;">+</span>f.<span style="color: #006633;">getLanguageName</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                    <span style="color: #0000ff;">&quot;,language version:&quot;</span><span style="color: #339933;">+</span>f.<span style="color: #006633;">getLanguageVersion</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                    <span style="color: #0000ff;">&quot;,names:&quot;</span><span style="color: #339933;">+</span>f.<span style="color: #006633;">getNames</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                    <span style="color: #0000ff;">&quot;,mime:&quot;</span><span style="color: #339933;">+</span>f.<span style="color: #006633;">getMimeTypes</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">+</span>
                    <span style="color: #0000ff;">&quot;,extension:&quot;</span><span style="color: #339933;">+</span>f.<span style="color: #006633;">getExtensions</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span></pre></div></div>

<p>The output on my machine is: engine name:Mozilla Rhino,engine version:1.6 release 2,language name:ECMAScript,language version:1.6,names:[js, rhino, JavaScript, javascript, ECMAScript, ecmascript],mime:[application/javascript, application/ecmascript, text/javascript, text/ecmascript],extension:[js]. As you can see, the only scripting language Java supports out of the box is JavaScript. However, any engine that follows JSR 223 can be plugged in to support other scripting languages; third-party libraries for the languages already supported are listed at https://scripting.dev.java.net/.</p>
<h2>2. hello script</h2>
<p>Here is a Hello World example of using JavaScript from Java:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">        ScriptEngineManager manager <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ScriptEngineManager <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        ScriptEngine engine <span style="color: #339933;">=</span> manager.<span style="color: #006633;">getEngineByName</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;js&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #003399;">String</span> script <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;print ('hello script')&quot;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            engine.<span style="color: #006633;">eval</span> <span style="color: #009900;">&#40;</span>script<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span>ScriptException e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span></pre></div></div>

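<p>The API is quite simple. ScriptEngineManager is the factory for ScriptEngine; when it is instantiated, it discovers all the script engines that are available. A ScriptEngine can be obtained from it with getEngineByName, getEngineByExtension or getEngineByMimeType, as long as the argument matches one of the names, extensions or MIME types that an engine registers. If nothing matches, these methods return null. To run a script, just call eval (the effect is the same as eval in JavaScript).</p>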
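<p>As a minimal sketch of the three lookup methods, the class below tries each of them and reports what it finds. The class name EngineLookupSketch and the helper describe are made up for illustration; whether a JavaScript engine is actually found depends on the JDK in use, so the sketch treats a null result as a normal outcome.</p>

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// Hypothetical helper class, not part of javax.script.
class EngineLookupSketch {
    // Returns the engine's own name, or a fixed marker when the lookup found nothing.
    static String describe(ScriptEngine engine) {
        return engine == null ? "not available" : engine.getFactory().getEngineName();
    }

    public static void main(String[] args) {
        ScriptEngineManager manager = new ScriptEngineManager();
        // The three lookups go by short name, file extension and MIME type.
        System.out.println("by name:      " + describe(manager.getEngineByName("js")));
        System.out.println("by extension: " + describe(manager.getEngineByExtension("js")));
        System.out.println("by MIME type: " + describe(manager.getEngineByMimeType("text/javascript")));
        // A name that no factory registers yields null rather than an exception.
        System.out.println("unknown:      " + describe(manager.getEngineByName("no-such-engine")));
    }
}
```

<p>Checking for null before calling eval avoids a NullPointerException on runtimes where the requested engine is not installed.</p>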
<h2>3. Passing variables</h2>
<p>Variables can be passed into a script so that Java code and script code can interact. For example:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">        ScriptEngineManager manager <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ScriptEngineManager<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        ScriptEngine engine <span style="color: #339933;">=</span> manager.<span style="color: #006633;">getEngineByName</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;js&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        engine.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;a&quot;</span>, <span style="color: #cc66cc;">4</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        engine.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;b&quot;</span>, <span style="color: #cc66cc;">6</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399;">Object</span> maxNum <span style="color: #339933;">=</span> engine.<span style="color: #006633;">eval</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;function max_num(a,b){return (a&gt;b)?a:b;}max_num(a,b);&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;max_num:&quot;</span> <span style="color: #339933;">+</span> maxNum<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span></pre></div></div>

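<p>Output: max_num:6</p>
<p>A variable set with put is visible within that engine only, that is, in ScriptContext.ENGINE_SCOPE. The variables are stored in a Map called Bindings, and a value can be read back with engine.getBindings(ScriptContext.ENGINE_SCOPE).get(&quot;a&quot;);. Besides ENGINE_SCOPE there is also ScriptContext.GLOBAL_SCOPE, a global scope whose variables are shared by all ScriptEngine instances created by the same ScriptEngineFactory (in practice, the global Bindings are held by the ScriptEngineManager and handed to every engine it creates).</p>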
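<p>The relationship between the two scopes can be seen without running any script. The sketch below uses SimpleScriptContext and SimpleBindings from javax.script directly; the class name ScopeLookupSketch and the values are made up for illustration. It shows that an unscoped lookup checks ENGINE_SCOPE first and falls back to GLOBAL_SCOPE.</p>

```java
import javax.script.Bindings;
import javax.script.ScriptContext;
import javax.script.SimpleBindings;
import javax.script.SimpleScriptContext;

// Hypothetical demo class, not part of javax.script.
class ScopeLookupSketch {
    // Builds a context in which "a" exists in both scopes and "b" only globally.
    static ScriptContext buildContext() {
        SimpleScriptContext context = new SimpleScriptContext();
        // SimpleScriptContext starts without global bindings, so supply some.
        context.setBindings(new SimpleBindings(), ScriptContext.GLOBAL_SCOPE);
        context.setAttribute("a", 4, ScriptContext.ENGINE_SCOPE);
        context.setAttribute("a", 99, ScriptContext.GLOBAL_SCOPE);
        context.setAttribute("b", 6, ScriptContext.GLOBAL_SCOPE);
        return context;
    }

    public static void main(String[] args) {
        ScriptContext context = buildContext();
        // Unscoped lookup searches ENGINE_SCOPE first, then GLOBAL_SCOPE.
        System.out.println("a = " + context.getAttribute("a")); // a = 4
        System.out.println("b = " + context.getAttribute("b")); // b = 6
        Bindings engineScope = context.getBindings(ScriptContext.ENGINE_SCOPE);
        System.out.println("engine scope holds b: " + engineScope.containsKey("b")); // false
    }
}
```

<p>The engine-scope value of a hides the global one, while b is only reachable through the global scope.</p>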
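<h2>4. Dynamic invocation</h2>
<p>The example above defines a JavaScript function, max_num. The Invocable interface lets you call functions in a script library repeatedly; Invocable is an optional interface that a ScriptEngine may implement. Here is an example:</p>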

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">        ScriptEngineManager manager <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ScriptEngineManager<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        ScriptEngine engine <span style="color: #339933;">=</span> manager.<span style="color: #006633;">getEngineByName</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;js&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            engine.<span style="color: #006633;">eval</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;function max_num(a,b){return (a&gt;b)?a:b;}&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            Invocable invoke <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>Invocable<span style="color: #009900;">&#41;</span> engine<span style="color: #339933;">;</span>
            <span style="color: #003399;">Object</span> maxNum <span style="color: #339933;">=</span> invoke.<span style="color: #006633;">invokeFunction</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;max_num&quot;</span>,<span style="color: #cc66cc;">4</span>,<span style="color: #cc66cc;">6</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>maxNum<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            maxNum <span style="color: #339933;">=</span> invoke.<span style="color: #006633;">invokeFunction</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;max_num&quot;</span>, <span style="color: #cc66cc;">7</span>,<span style="color: #cc66cc;">6</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>maxNum<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span></pre></div></div>

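<p>The first argument of invokeFunction is the name of the script function to call; the varargs that follow are the arguments passed to that function.</p>
<p>Invocable has another very cool capability: implementing interfaces dynamically. It can return an instance of a Java interface from the script engine; in other words, you can define a Java interface whose implementation is provided by a script. Continuing the example above, define an interface JSLib whose method has the same name and parameter list as the JavaScript function:</p>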

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">interface</span> JSLib <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">int</span> max_num<span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">int</span> a,<span style="color: #000066; font-weight: bold;">int</span> b<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span></pre></div></div>

<p>Calling it:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">        ScriptEngineManager manager <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ScriptEngineManager<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        ScriptEngine engine <span style="color: #339933;">=</span> manager.<span style="color: #006633;">getEngineByName</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;js&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            engine.<span style="color: #006633;">eval</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;function max_num(a,b){return (a&gt;b)?a:b;}&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            Invocable invoke <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>Invocable<span style="color: #009900;">&#41;</span> engine<span style="color: #339933;">;</span>
            JSLib jslib <span style="color: #339933;">=</span> invoke.<span style="color: #006633;">getInterface</span><span style="color: #009900;">&#40;</span>JSLib.<span style="color: #000000; font-weight: bold;">class</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000066; font-weight: bold;">int</span> maxNum <span style="color: #339933;">=</span> jslib.<span style="color: #006633;">max_num</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">4</span>,<span style="color: #cc66cc;">6</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>maxNum<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span></pre></div></div>

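<h2>5. Using Java objects</h2>
<p>Java code can be used from within JavaScript, which is a really cool thing. In Rhino you can import a class with importClass, import a package with importPackage, or refer to a class directly by its fully qualified name. new is not even required when creating an object. Sample code:</p>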

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">        ScriptEngineManager manager <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ScriptEngineManager<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        ScriptEngine engine <span style="color: #339933;">=</span> manager.<span style="color: #006633;">getEngineByName</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;js&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399;">String</span> script <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;var list = java.util.ArrayList();list.add(<span style="color: #000099; font-weight: bold;">\&quot;</span>kafka0102<span style="color: #000099; font-weight: bold;">\&quot;</span>);print(list.get(0));&quot;</span><span style="color: #339933;">;</span>
            engine.<span style="color: #006633;">eval</span><span style="color: #009900;">&#40;</span>script<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span></pre></div></div>

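<h2>6. Compiled execution</h2>
<p>By default a script engine interprets scripts. If a script needs to be executed repeatedly, the optional Compilable interface can be used to compile it first for better performance. Sample code:</p>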

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">        ScriptEngineManager manager <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ScriptEngineManager<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        ScriptEngine engine <span style="color: #339933;">=</span> manager.<span style="color: #006633;">getEngineByName</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;js&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            Compilable compEngine <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>Compilable<span style="color: #009900;">&#41;</span> engine<span style="color: #339933;">;</span>
            CompiledScript script <span style="color: #339933;">=</span> compEngine.<span style="color: #006633;">compile</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;function max_num(a,b){return (a&gt;b)?a:b;}&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            script.<span style="color: #006633;">eval</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            Invocable invoke <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>Invocable<span style="color: #009900;">&#41;</span> engine<span style="color: #339933;">;</span>
            <span style="color: #003399;">Object</span> maxNum <span style="color: #339933;">=</span> invoke.<span style="color: #006633;">invokeFunction</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;max_num&quot;</span>,<span style="color: #cc66cc;">4</span>,<span style="color: #cc66cc;">6</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>maxNum<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span></pre></div></div>
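<p>Compilation pays off mainly when the same CompiledScript is evaluated many times. The following is a sketch of that pattern, not a benchmark: it compiles one expression and evaluates it with fresh Bindings for each input pair. The class name CompiledReuseSketch and the method maxima are made up for illustration, and the code assumes an engine registered under the name "js" that implements Compilable; on a JDK that ships no JavaScript engine, getEngineByName returns null and the method returns null as well.</p>

```java
import javax.script.Bindings;
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical demo class, not part of javax.script.
class CompiledReuseSketch {
    // Compiles the expression once and evaluates it once per input pair.
    // Returns null when no compilable "js" engine is installed.
    static Number[] maxima(int[][] pairs) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("js");
        if (!(engine instanceof Compilable)) {
            return null;
        }
        try {
            CompiledScript script = ((Compilable) engine).compile("(a > b) ? a : b");
            Number[] results = new Number[pairs.length];
            for (int i = 0; i < pairs.length; i++) {
                // Fresh bindings per call, so no state leaks between evaluations.
                Bindings bindings = engine.createBindings();
                bindings.put("a", pairs[i][0]);
                bindings.put("b", pairs[i][1]);
                results[i] = (Number) script.eval(bindings);
            }
            return results;
        } catch (ScriptException e) {
            throw new IllegalStateException("script failed", e);
        }
    }

    public static void main(String[] args) {
        Number[] results = maxima(new int[][] {{4, 6}, {7, 6}});
        if (results == null) {
            System.out.println("no compilable JavaScript engine available");
        } else {
            System.out.println(results[0] + ", " + results[1]);
        }
    }
}
```

<p>Passing the inputs through Bindings keeps the compiled script itself unchanged between calls, which is what makes reusing it possible.</p>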

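<h2>7. Summary</h2>
<p>Beyond the features covered above, script engines offer other useful capabilities, such as executing script files and running scripts asynchronously from multiple threads. Introducing a script engine makes it possible to support configuration extensions and business rules in a more powerful and flexible way, and lets users write those rules in a scripting language they already know.</p>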
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/06/155.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Guide to Using log4j</title>
		<link>http://www.kafka0102.com/2010/05/147.html</link>
		<comments>http://www.kafka0102.com/2010/05/147.html#comments</comments>
		<pubDate>Mon, 31 May 2010 13:26:51 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[log]]></category>
		<category><![CDATA[log4j]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=147</guid>
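		<description><![CDATA[log4j is a veteran logging tool in the Java world. Its powerful features and ease of use mean it turns up everywhere in open source projects. Even though JDK 1.4 introduced its own logging facility, log4j remains the most popular logging tool. For users, the log4j API boils down to a handful of methods for printing log messages; what deserves the most attention is its configuration file. Yet many people simply find a sample configuration online and get it running, without making more effective use of log4j for handling logs. This is really not just about log4j itself, but about how to use tools effectively to record, analyze and monitor logs.]]></description>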
			<content:encoded><![CDATA[<h2>1.Introduction</h2>
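<p>log4j is a veteran logging tool in the Java world. Its powerful features and ease of use mean it turns up everywhere in open source projects. Even though JDK 1.4 introduced its own logging facility, log4j remains the most popular logging tool. For users, the log4j API boils down to a handful of methods for printing log messages; what deserves the most attention is its configuration file. Yet many people simply find a sample configuration online and get it running, without making more effective use of log4j for handling logs. This is really not just about log4j itself, but about how to use tools effectively to record, analyze and monitor logs.</p>
<p>The core concepts in log4j are logger, appender, layout and filter, each introduced below. These concepts can be expressed either through the configuration file or through the API. In practice it is enough to focus on the details of the configuration file, without worrying about log4j's own API and implementation. Although the configuration can be manipulated, and even extended, through the API instead of a configuration file, log4j already provides so much functionality that users rarely need to build on top of it. To put this article together I skimmed the log4j implementation, though not in any depth; the content is based mainly on the log4j manual and related articles. log4j accepts configuration as either a Java properties file or an XML file. This article uses the XML format, because the properties format does not support Filter.</p>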
<h2>2.Loggers</h2>
<p>The log4j Logger class provides the following:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">package</span> <span style="color: #006699;">org.apache.log4j</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> Logger <span style="color: #009900;">&#123;</span>
<span style="color: #666666; font-style: italic;">// Creation &amp; retrieval methods:</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> Logger getRootLogger<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> Logger getLogger<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span> name<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// printing methods:</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> trace<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> debug<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> info<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> warn<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> error<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> fatal<span style="color: #009900;">&#40;</span><span style="color: #003399;">Object</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// generic printing method:</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000066; font-weight: bold;">void</span> log<span style="color: #009900;">&#40;</span>Level l, <span style="color: #003399;">Object</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

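<p>In practice, a class usually declares a static logger member with private static Logger logger = Logger.getLogger(package.classname); and logs by calling the method for the desired level. The argument of getLogger identifies the Logger, and the names form a hierarchy: for example, "com.foo" is the parent Logger of "com.foo.Bar". The XML configuration format for a logger is:</p>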

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;">&lt;!ELEMENT logger <span style="color: #66cc66;">&#40;</span>level?,appender-ref*<span style="color: #66cc66;">&#41;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;">&lt;!ATTLIST logger</span>
<span style="color: #009900;">name ID #REQUIRED</span>
<span style="color: #009900;">additivity <span style="color: #66cc66;">&#40;</span>true|false<span style="color: #66cc66;">&#41;</span> <span style="color: #ff0000;">&quot;true&quot;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&gt;</span></span></pre></div></div>

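<p>The child element level sets the lowest level that will be output (if no level is set anywhere, the root logger's default, DEBUG, applies), and appender-ref refers to an appender defined elsewhere in the configuration (there can be several). The name attribute is the logger's identifier. The additivity attribute controls whether log events are also passed up the hierarchy to the appenders of ancestor loggers. For example, suppose A is the parent logger of B and each has its own appender: when B's additivity is true (the default), a message logged through B is written to B's appender and to A's as well; when it is false, the message goes only to B's own appenders. additivity has no effect on level: a logger with no level of its own always uses the level of its nearest ancestor that has one.</p>
<p>At the top of the logger hierarchy is the root logger, which can be obtained with getRootLogger() (although few people ever do). A typical configuration configures only the root logger, and the loggers created in individual classes then simply inherit its settings.</p>
<p>Sometimes a particular logger needs special treatment. For example, my module uses a memcached client library; because the module's level is debug, all of the library's debug messages are printed as well, and I really do not care about them, so I turn them off with the following configuration:</p>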

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;logger</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;com.danga&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;level</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;info&quot;</span><span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/logger<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

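<p>As for log levels, log4j supports custom levels created by subclassing Level, which may help in some scenarios; for example, you could add a Level for statistics-related logs. Also, as in the example above, a logger's level is inherited: when a child logger has no level of its own, it uses its parent's, and the lookup continues up to the root logger.</p>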
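<p>The level lookup described above can be modelled in a few lines of plain Java. This is only an illustration of the rule, not log4j's actual implementation: the class LevelInheritanceSketch, its methods and the use of the empty string for the root logger are all assumptions made for the sketch.</p>

```java
import java.util.Properties;

// Hypothetical model of level inheritance; it does not use the log4j API.
class LevelInheritanceSketch {
    // Explicitly configured levels keyed by logger name; "" stands for the root logger.
    private final Properties configured = new Properties();

    LevelInheritanceSketch(String rootLevel) {
        configured.setProperty("", rootLevel);
    }

    void setLevel(String loggerName, String level) {
        configured.setProperty(loggerName, level);
    }

    // Walks from the logger up through its dotted-name ancestors to the root
    // and returns the first explicitly configured level.
    String effectiveLevel(String loggerName) {
        String name = loggerName;
        while (!configured.containsKey(name)) {
            int lastDot = name.lastIndexOf('.');
            name = (lastDot == -1) ? "" : name.substring(0, lastDot);
        }
        return configured.getProperty(name);
    }

    public static void main(String[] args) {
        LevelInheritanceSketch levels = new LevelInheritanceSketch("DEBUG");
        levels.setLevel("com.danga", "INFO");
        // A logger below com.danga picks up INFO; everything else falls back to the root.
        System.out.println(levels.effectiveLevel("com.danga.MemCached.MemCachedClient")); // INFO
        System.out.println(levels.effectiveLevel("com.foo.Bar")); // DEBUG
    }
}
```

<p>This mirrors the com.danga configuration shown earlier: setting a level on one package name affects every logger whose name sits beneath it.</p>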
<h2>3.Appenders</h2>
<p>An appender determines where log output goes. In log4j.dtd, an appender is declared as follows:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;">&lt;!ELEMENT appender <span style="color: #66cc66;">&#40;</span>errorHandler?, param*, layout?, filter*, appender-ref*<span style="color: #66cc66;">&#41;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;">&lt;!ATTLIST appender</span>
<span style="color: #009900;">name ID #REQUIRED</span>
<span style="color: #009900;">class CDATA #REQUIRED</span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&gt;</span></span></pre></div></div>

<p>An example:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;appender</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;console&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.ConsoleAppender&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;Target&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;System.out&quot;</span><span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;layout</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.PatternLayout&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;ConversionPattern&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;%-5p %c{1} - %m%n&quot;</span><span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/layout<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/appender<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

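<p>The child elements an appender may contain are as follows:<br />
1) errorHandler: a hook. When the appender hits an error (an invalid layout, for example), the configured errorHandler can do the cleanup work. It usually does not need to be configured.<br />
2) param: each appender has its own options, and every param is a key-value pair. The options are listed in the log4j API doc for each Appender implementation class, where they appear in bold.<br />
3) layout: described below.<br />
4) filter: described below.<br />
5) appender-ref: an appender can also reference other appenders, as AsyncAppender does.</p>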
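<p>A brief look at the most commonly used appenders:<br />
1. ConsoleAppender: writes log output to the console, which is more convenient than a file when watching logs during development. The param you will normally set is Target, which takes System.out or System.err and defaults to System.out; with System.err, the Eclipse console shows the output in red. ConsoleAppender is also useful when everything an application prints (including exception output that does not go through the logger) should end up in one file: redirect the process output to that file.</p>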
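<p>2. FileAppender: writes log output to a file and is the most widely used appender. Its param elements are:<br />
1) File: path of the output file.<br />
2) Append: how the log file is opened. The default, true, appends to the file; false truncates any existing content.<br />
3) BufferedIO: defaults to false. When true, the Writer is wrapped in a BufferedWriter. Buffering reduces the number of writes, but entries reach the file late and buffered entries can be lost if the process dies, so for server applications it is usually left off.</p>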
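<p>3. DailyRollingFileAppender: an extension of FileAppender that rolls the log file on a schedule, which saves setting up a crontab job that runs a rotation script. Its param elements are:<br />
1) File: path of the output file.<br />
2) Append: how the log file is opened. The default, true, appends to the file; false truncates any existing content.<br />
3) DatePattern: determines when the file is rolled. The pattern follows SimpleDateFormat conventions and allows rolling by minute, hour, day, week or month. For example, "'.'yyyy-MM-dd" rolls the log once a day at midnight (more precisely, on the first log event after midnight). If the log file is foo.log, then after the roll-over at the start of 2010-05-31 the previous day's file is named foo.log.2010-05-30 and new entries for the 31st go to foo.log. The roll-over itself works like this: close the open log file (foo.log), rename it (to foo.log.2010-05-30), then open a newly created foo.log.</p>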
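<p>To make the naming rule concrete, here is a minimal sketch that builds the rolled file name with SimpleDateFormat. It does not use log4j at all; the class RolledNameSketch and the method rolledName are names invented for this illustration, and the only assumption is the one stated above, that DatePattern follows SimpleDateFormat conventions.</p>

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Illustration only: shows how a DatePattern such as "'.'yyyy-MM-dd"
// turns into the suffix of a rolled log file. Not part of log4j.
final class RolledNameSketch {

    // Text inside single quotes is literal; the rest are date fields.
    static String rolledName(String file, String datePattern, Date periodStart) {
        SimpleDateFormat format = new SimpleDateFormat(datePattern, Locale.ENGLISH);
        return file + format.format(periodStart);
    }

    public static void main(String[] args) {
        Date may30 = new GregorianCalendar(2010, Calendar.MAY, 30, 12, 0).getTime();
        // The file that held the entries of May 30 after the roll-over.
        System.out.println(rolledName("foo.log", "'.'yyyy-MM-dd", may30));
    }
}
```

<p>The leading '.' is quoted so that SimpleDateFormat treats it as literal text; a finer-grained pattern (one that includes the hour, for example) yields a finer-grained roll-over.</p>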
<h2>4.Layouts</h2>
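<p>A layout defines the format of the log output. log4j ships TTCCLayout, HTMLLayout, PatternLayout, SimpleLayout and XMLLayout. PatternLayout is the one normally used; SimpleLayout is the cheapest (because it is simple enough). The conversion characters PatternLayout supports are:</p>
<p>%m: the log message.<br />
%p: the priority (level) of the log event, such as DEBUG or INFO.<br />
%r: the time elapsed since the application started; rarely useful.<br />
%c: the category name, that is, the argument passed to getLogger; in my experience also of limited use.<br />
%t: the name of the current thread; may come in handy in multithreaded environments.<br />
%x: the nested diagnostic context (NDC). This is very useful when many client requests are being served. When using logs to track down a problem, you often want to follow one failing request through its processing and find the stage where it went wrong, which requires a unique identifier per request. The identifier can be a globally unique logid that the front-most module assigns and every module in the flow carries along, or something else such as the request IP or the request parameters. log4j's NDC can put this information into the log. The NDC interface looks like this:</p>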

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> NDC <span style="color: #009900;">&#123;</span>
<span style="color: #666666; font-style: italic;">// Used when printing the diagnostic</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #003399;">String</span> get<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Remove the top of the context from the NDC.</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #003399;">String</span> pop<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Add diagnostic context for the current thread.</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> push<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span> message<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #666666; font-style: italic;">// Remove the diagnostic context for this thread.</span>
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000066; font-weight: bold;">void</span> remove<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

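<p>In the thread that handles a request (a servlet, for example), call NDC.push at the start of each request to set the identifier and call remove when processing ends (or call remove right before push).</p>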
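<p>The push/remove discipline can be sketched without log4j. The class below is a simplified stand-in for NDC built on a ThreadLocal stack; DiagnosticContextSketch and its handle method are hypothetical names, and joining nested contexts with a space is a choice made for this sketch, not a statement about log4j's implementation.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified stand-in for a nested diagnostic context: one stack per thread.
final class DiagnosticContextSketch {
    private static final ThreadLocal<Deque<String>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Add diagnostic context for the current thread.
    static void push(String message) {
        STACK.get().addLast(message);
    }

    // Remove and return the innermost context, or "" if there is none.
    static String pop() {
        String top = STACK.get().pollLast();
        return top == null ? "" : top;
    }

    // The text a layout would print for the context (space-joined in this sketch).
    static String get() {
        return String.join(" ", STACK.get());
    }

    // Drop the whole context of the current thread.
    static void remove() {
        STACK.remove();
    }

    // Hypothetical request handler: tag every line with the request id.
    static String handle(String requestId, String message) {
        remove();            // pooled threads may still carry an old context
        push(requestId);
        try {
            return "[" + get() + "] " + message;
        } finally {
            remove();        // never leak the id into the next request
        }
    }
}
```

<p>Calling remove in a finally block matters with thread pools: a worker thread outlives the request, so a context that is not cleared would show up in the log lines of the next request served by the same thread.</p>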
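<p>%n: the platform-dependent line separator, such as "\n" or "\r\n"; normally placed right after %m.<br />
WARNING: the options below are costly; measure them wherever performance matters.<br />
%d: the timestamp. A format can be given, for example %d{HH:mm:ss,SSS} or %d{dd MMM yyyy HH:mm:ss,SSS}.<br />
%C: the fully qualified class name of the caller of the logging method. The default is the full name (package plus class name); {n} limits the output to the last n components, so for "com.foo.SomeClass" the pattern %C{1} prints "SomeClass".<br />
%M: the method name of the caller of the logging method.<br />
%F: the file name of the caller of the logging method.<br />
%L: the line number of the caller of the logging method.<br />
%l: the source location of the caller of the logging method; shorthand for %C.%M(%F:%L).</p>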
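<p>Of the options above, the ones tied to the caller's location are the expensive ones. To obtain this information, log4j captures the stack of the whole call chain through a Throwable, walks the frames until it finds the one directly above the logging method (debug and so on), and extracts the location details from that frame. This is clearly far more work than producing any of the other items, yet it is very helpful when digging through logs for a problem. My suggestion: leave the caller location out of info-level logs and include it for debug, warn and error. Printing the timestamp is also a must; without it there is nothing to base statistics or troubleshooting on.</p>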
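<p>The stack walk can be illustrated in plain Java. This is a minimal sketch of the idea, not log4j's actual code: CallerLocationSketch, log and businessMethod are made-up names, and log stands in for a logging method such as debug.</p>

```java
// Illustration of how a logging method can discover who called it.
final class CallerLocationSketch {

    // Stand-in for a logging method; returns "Class.method" of its caller.
    static String log() {
        // Capturing the stack trace is the expensive part.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // frames[0] is log() itself, frames[1] is the method that called it.
        StackTraceElement caller = frames[1];
        return caller.getClassName() + "." + caller.getMethodName();
    }

    static String businessMethod() {
        return log();
    }

    public static void main(String[] args) {
        System.out.println(businessMethod());
    }
}
```

<p>StackTraceElement also exposes getFileName and getLineNumber, which correspond to what %F and %L print. A Throwable is created and its stack filled in on every call, which is why these options deserve measuring on hot paths.</p>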
<h2>5.Filter</h2>
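<p>A filter in log4j can restrict the range of levels an appender outputs, which makes it possible to send different levels to different files within one application. debug and info entries are produced in large volume every day and are mostly used for statistics and analysis; warn and error entries need monitoring and handling, and a person may well go and read them; so separating the two is well worth it. Logs with special requirements can also go to a file of their own. Here is an example that uses a filter:</p>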

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;appender</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;TRACE&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.ConsoleAppender&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;layout</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.PatternLayout&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;ConversionPattern&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;[%t] %-5p %c - %m%n&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/layout<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filter</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.varia.LevelRangeFilter&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;levelMin&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;DEBUG&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
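<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;levelMax&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;INFO&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;AcceptOnMatch&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;true&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>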
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/filter<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filter</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.varia.DenyAllFilter&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/appender<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

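<p>LevelRangeFilter matches a range of levels (from levelMin to levelMax). Events outside the range are denied by the filter itself. For events inside the range the outcome depends on AcceptOnMatch: when it is true the event is accepted at once; when it is false (the default) the filter stays neutral and hands the event to the next filter in the chain. DenyAllFilter denies every event that reaches it, so it should only follow a LevelRangeFilter whose AcceptOnMatch is true; otherwise the in-range events fall through to DenyAllFilter and are dropped as well. Combined that way, the appender prints only the entries from DEBUG to INFO. Another handy filter in log4j is LevelMatchFilter, which matches exactly one level.</p>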
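<p>How a filter chain reaches its decision is easier to see in a small model. The sketch below is plain Java and does not use log4j classes; FilterChainSketch, levelRange, denyAll and isLogged are names invented here. The assumption it encodes is that the first filter returning something other than NEUTRAL decides, and that an event no filter objects to is logged.</p>

```java
import java.util.Arrays;
import java.util.List;

// Toy model of a filter chain with DENY / NEUTRAL / ACCEPT decisions.
final class FilterChainSketch {
    enum Decision { DENY, NEUTRAL, ACCEPT }
    enum Level { DEBUG, INFO, WARN, ERROR }

    interface Filter {
        Decision decide(Level level);
    }

    // Out of range: DENY. In range: ACCEPT if acceptOnMatch, else NEUTRAL.
    static Filter levelRange(Level min, Level max, boolean acceptOnMatch) {
        return level -> {
            if (level.compareTo(min) < 0 || level.compareTo(max) > 0) {
                return Decision.DENY;
            }
            return acceptOnMatch ? Decision.ACCEPT : Decision.NEUTRAL;
        };
    }

    // Denies whatever reaches it.
    static Filter denyAll() {
        return level -> Decision.DENY;
    }

    // The first non-NEUTRAL decision wins; all NEUTRAL means the event is logged.
    static boolean isLogged(List<Filter> chain, Level level) {
        for (Filter filter : chain) {
            Decision decision = filter.decide(level);
            if (decision == Decision.DENY) return false;
            if (decision == Decision.ACCEPT) return true;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Filter> chain = Arrays.asList(
                levelRange(Level.DEBUG, Level.INFO, true), denyAll());
        System.out.println(isLogged(chain, Level.INFO)); // in range
        System.out.println(isLogged(chain, Level.WARN)); // out of range
    }
}
```

<p>In this model a range filter with acceptOnMatch set to false followed by denyAll logs nothing at all, which is the pitfall to avoid when combining the two filters.</p>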
<h2>6.Example</h2>
<p>Below is a complete sample log4j.xml configuration file:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;?xml</span> <span style="color: #000066;">version</span>=<span style="color: #ff0000;">&quot;1.0&quot;</span> <span style="color: #000066;">encoding</span>=<span style="color: #ff0000;">&quot;UTF-8&quot;</span> <span style="color: #000000; font-weight: bold;">?&gt;</span></span>
<span style="color: #00bbdd;">&lt;!DOCTYPE log4j:configuration SYSTEM &quot;log4j.dtd&quot;&gt;</span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;log4j:configuration</span> <span style="color: #000066;">xmlns:log4j</span>=<span style="color: #ff0000;">&quot;http://jakarta.apache.org/log4j/&quot;</span></span>
<span style="color: #009900;"><span style="color: #000066;">debug</span>=<span style="color: #ff0000;">&quot;true&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;appender</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;info-out&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.DailyRollingFileAppender&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;File&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;${log_path}.log&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;DatePattern&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;'.'yyyy-MM-dd&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;layout</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.PatternLayout&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;ConversionPattern&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;[%d{yyyy-MM-dd HH:mm:ss}][%p][%F(%L)]%m%n&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/layout<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filter</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.varia.LevelRangeFilter&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;LevelMin&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;debug&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;LevelMax&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;info&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;AcceptOnMatch&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;true&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/filter<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filter</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.varia.DenyAllFilter&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/appender<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;appender</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;error-out&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.DailyRollingFileAppender&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;Append&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;false&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;DatePattern&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;'.'yyyy-MM-dd&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;File&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;${log_path}.wf.log&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;layout</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.PatternLayout&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;ConversionPattern&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;[%d{yyyy-MM-dd HH:mm:ss}][%p][%F(%L)]%m%n&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/layout<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filter</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.varia.LevelRangeFilter&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;LevelMin&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;warn&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;LevelMax&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;error&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;AcceptOnMatch&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;true&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/filter<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filter</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.apache.log4j.varia.DenyAllFilter&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/appender<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;root<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;level</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;debug&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;appender-ref</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;info-out&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;appender-ref</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;error-out&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/root<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;logger</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;com.danga&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;level</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;info&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/logger<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/log4j:configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>In the configuration, the line</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;param</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;File&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;${log_path}.log&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span></pre></div></div>

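<p>uses log_path, a Java system property (set with the -D option) that log4j substitutes into the value. The configuration achieves the following:<br />
1) Three log files are produced: one for debug and info entries, one for warn and error entries, and one that receives the redirected process output (mostly GC information).<br />
2) DailyRollingFileAppender rolls the log files.<br />
3) debug entries from the com.danga hierarchy (the memcached client library) are suppressed.</p>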
<h2>7.Performance</h2>
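<p>I have not measured log4j's performance in detail. Leaving log4j aside, the main cost of logging lies in the output itself, so the less you log the better. Beyond that, two points deserve attention when using log4j:<br />
1. In production the debug level is normally switched off, but a program with many debug calls still pays a price. Because debug calls print debugging information, their argument is typically several strings joined with +, and this classic way of building temporary objects costs something even when nothing is printed. Worse, some calls invoke object.toString, and an overridden toString may well assemble many fields of the object into one string, which is a poor fit where performance requirements are high. Base libraries and frameworks therefore often contain a snippet like the one below to avoid the problem. isDebugEnabled is only a check and, as long as the logger hierarchy is not complicated, costs next to nothing:</p>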

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>logger.<span style="color: #006633;">isDebugEnabled</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
logger.<span style="color: #006633;">debug</span><span style="color: #009900;">&#40;</span>......<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>2. A complicated logger hierarchy also costs performance. The good news is that configuring just the root logger is usually enough.</p>
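<p>The cost described in point 1 can be made visible by counting toString calls. The sketch uses java.util.logging from the JDK as a stand-in so that it runs without log4j on the classpath; isLoggable(Level.FINE) plays the role of isDebugEnabled, and DebugGuardSketch and Expensive are hypothetical names.</p>

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Shows that an unguarded debug call still builds its message string.
final class DebugGuardSketch {
    static int toStringCalls = 0;

    // An object whose toString we pretend is costly.
    static final class Expensive {
        @Override
        public String toString() {
            toStringCalls++;
            return "Expensive{...}";
        }
    }

    static final Logger LOG = Logger.getLogger("DebugGuardSketch");

    // Returns how many times toString ran although FINE is disabled.
    static int unguarded() {
        toStringCalls = 0;
        LOG.setLevel(Level.INFO);
        LOG.fine("state=" + new Expensive()); // message is built, then discarded
        return toStringCalls;
    }

    // Same call behind a level check: the message is never built.
    static int guarded() {
        toStringCalls = 0;
        LOG.setLevel(Level.INFO);
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("state=" + new Expensive());
        }
        return toStringCalls;
    }

    public static void main(String[] args) {
        System.out.println("unguarded toString calls: " + unguarded());
        System.out.println("guarded toString calls: " + guarded());
    }
}
```

<p>The guard adds one line per call site, so it pays off mainly where the message is expensive to build or the call sits on a hot path.</p>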
<h2>8.Conclusions</h2>
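<p>That is all for now on logging in Java applications. log4j is good and powerful, but if your program is a base service such as a library or a framework, consider calling slf4j (http://www.slf4j.org) instead of the log4j API. slf4j is a facade over several existing logging libraries; it exposes one uniform interface and solves the problem of incompatible logging between programs that depend on each other.</p>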
<h2>9.Reference</h2>
<p>http://logging.apache.org/log4j/1.2/manual.html</p>
<p>http://wiki.apache.org/logging-log4j/Log4jXmlFormat</p>
<p>http://www.vipan.com/htdocs/log4jhelp.html</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/05/147.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Sharing Sify.com's Architecture Experience</title>
		<link>http://www.kafka0102.com/2010/05/144.html</link>
		<comments>http://www.kafka0102.com/2010/05/144.html#comments</comments>
		<pubDate>Sat, 15 May 2010 17:27:11 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[Sify.com]]></category>
		<category><![CDATA[solr]]></category>
		<category><![CDATA[架构]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=144</guid>
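		<description><![CDATA[The architecture shared today comes from Sify.com Architecture - A Portal at 3900 Requests Per Second (the title smells a bit like clickbait). If you read English comfortably and have no use for my summary, skip this post. Sify.com is an Indian portal, apparently an information-aggregation site. It reports 150 million page views a month and 3900 requests per second (presumably the peak rate across all services, asynchronous requests included; otherwise the figure does not square with the page views). By scale it is a mid-sized site, but its architecture is well worth discussing.]]></description>
			<content:encoded><![CDATA[<p>The architecture shared today comes from <a href="http://highscalability.com/blog/2010/5/10/sifycom-architecture-a-portal-at-3900-requests-per-second.html" target="_blank">Sify.com Architecture - A Portal at 3900 Requests Per Second</a> (the title smells a bit like clickbait). If you read English comfortably and have no use for my summary, skip this post. Sify.com is an Indian portal, apparently an information-aggregation site. It reports 150 million page views a month and 3900 requests per second (presumably the peak rate across all services, asynchronous requests included; otherwise the figure does not square with the page views). By scale it is a mid-sized site, but its architecture is well worth discussing.</p>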
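<h2>Site Architecture</h2>
<p>1. To save hardware and get the most out of each machine, Sify.com makes heavy use of virtual machines. Each service is deployed on several machines, so that when one goes down the others keep serving. Redundancy and load balancing are currently handled by editing configuration by hand; automated management and scaling may come later. As far as operations go, never mind small and mid-sized sites: even some large sites have not automated operations, and many still fix things by hand after an alert arrives. Automated operations are cool, but from a cost perspective most small and mid-sized sites need not chase them. Good monitoring that lets you spot and fix a problem as quickly as possible is enough.</p>
<p>2. Sify.com's storage is unusual. It does not use a general-purpose database: relational data lives in the search system and full-text data in a distributed file system. I have read about this kind of search-system-plus-key-value-file-store solution before. Sify.com's search system is based on Solr; I have been looking at Solr recently too, and the search system I plan to rebuild will also be based on Solr. A query first retrieves ids from the search system and, if the full text is needed, then fetches the content from the file system. (It uses GFS, which is not Google's GFS but an open-source cluster file system; I do not know it well either, and anyone interested can visit http://sourceware.org/cluster/gfs/.) This is also the usual strategy for search systems.</p>
<p>3. Submissions at Sify.com are processed asynchronously: they are sent to ActiveMQ/Camel and then dispatched to the service responsible. Camel is an open-source integration framework that fills the ESB role here; I do not know it well either. The services at Sify.com seem to talk to each other over HTTP throughout, which I find rather endearing.</p>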
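<p>The two-step read path from item 2 (ask the search system for ids, then fetch the full text by id) can be sketched with two in-memory maps. Everything here is a hypothetical stand-in: TwoStepReadSketch is not Sify.com's code, the index map plays the part of Solr, and the fileStore map plays the part of the cluster file system.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: a term index that returns ids, and a separate store for full text.
final class TwoStepReadSketch {
    // Stand-in for the search system: term -> ids of matching documents.
    private final Map<String, List<String>> index = new HashMap<String, List<String>>();
    // Stand-in for the file system: id -> full text.
    private final Map<String, String> fileStore = new HashMap<String, String>();

    void addDocument(String id, String fullText) {
        fileStore.put(id, fullText);
        for (String term : fullText.toLowerCase().split("\\s+")) {
            List<String> ids = index.get(term);
            if (ids == null) {
                ids = new ArrayList<String>();
                index.put(term, ids);
            }
            if (!ids.contains(id)) {
                ids.add(id);
            }
        }
    }

    // Step 1: the search system answers with ids only.
    List<String> searchIds(String term) {
        List<String> ids = index.get(term.toLowerCase());
        return ids == null ? new ArrayList<String>() : ids;
    }

    // Step 2: full text is fetched by id, and only when the caller needs it.
    List<String> searchFullText(String term) {
        List<String> result = new ArrayList<String>();
        for (String id : searchIds(term)) {
            result.add(fileStore.get(id));
        }
        return result;
    }
}
```

<p>Keeping the large text bodies out of the index keeps the index small, and list pages that only need ids or a few stored fields never touch the file store.</p>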
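<h2>Looking Ahead</h2>
<p>Sify.com lists a number of plans, which I will not repeat one by one. The one that interests me most is the plan to use the Drools rule engine for cache invalidation. The caches it mentions are Akamai and Varnish, so what gets invalidated should be page content cached on the CDN and locally. One URL request may invalidate several cache entries, and having business-logic code deal with front-end page cache invalidation is not pleasant, so a single system that handles it makes management and operation much easier. The approach: for every URL that triggers invalidation, the URL and related information are written to a log, and Drools applies the cache invalidation policy according to the configured rules. It strikes me as an interesting way to do it.</p>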
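<p>The idea of "one logged URL, several cache entries to invalidate" can be sketched as a rule table. This is plain Java, not Drools, and not Sify.com's implementation; CacheInvalidationRulesSketch, the URL patterns and the cache keys are all invented for illustration.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Toy rule table: URL pattern -> cache keys that must be purged.
final class CacheInvalidationRulesSketch {
    private final Map<Pattern, List<String>> rules =
            new LinkedHashMap<Pattern, List<String>>();

    void addRule(String urlRegex, List<String> cacheKeys) {
        rules.put(Pattern.compile(urlRegex), cacheKeys);
    }

    // Collects, without duplicates, the keys of every rule the URL matches.
    List<String> keysToInvalidate(String loggedUrl) {
        List<String> keys = new ArrayList<String>();
        for (Map.Entry<Pattern, List<String>> rule : rules.entrySet()) {
            if (rule.getKey().matcher(loggedUrl).matches()) {
                for (String key : rule.getValue()) {
                    if (!keys.contains(key)) {
                        keys.add(key);
                    }
                }
            }
        }
        return keys;
    }
}
```

<p>A consumer that tails the log would call keysToInvalidate for each logged URL and send the purge requests; the business code that served the request never needs to know which pages are cached where.</p>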
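<h2>Lessons Learned</h2>
<p>The lessons Sify.com reports all relate to defects in the software it uses. In summary:</p>
<p>1. First comes ActiveMQ, which gave Sify.com a rough time. I have only a basic understanding of ActiveMQ so far and am in no position to judge. Two problems are mentioned: 1) ActiveMQ exhausts socket resources. ActiveMQ 5.0 claims to have fixed this, but Sify.com saw no improvement and had to keep restarting ActiveMQ; the eventual workaround was to deploy two ActiveMQ instances and switch between them in turn, which is quite a makeshift solution. 2) With Topic-based publish/subscribe and 4 consumers, ActiveMQ would often hang for several hours. The workaround was to use 4 Queues instead (ouch), but it still did not run stably and occasionally threw exceptions or ran out of memory. ActiveMQ ought to be mature by now, so I cannot tell whether the problems are real or Sify.com simply did not use it well. My company uses ActiveMQ too, and I plan to study ActiveMQ and JMS in more depth when time allows.</p>
<p>2. Solr has 3 problems: 1) Solr sometimes stops responding to requests or times out, and a restart is needed to fix it; this is something I may need to watch. 2) Solr is slow on complex queries (those with NOT, for example). This should be a Lucene issue and can only be worked around as far as possible, by splitting a complex query into several simple ones and merging the results. 3) Solr does not currently provide real-time search. Sify.com plans to adopt LinkedIn's Zoie (which has a Solr plugin), but because the primary keys of Sify.com's index data combine letters and digits while Zoie's are numeric, Sify.com has some extension work to do. I analyzed Zoie specifically a while ago.</p>
<p>3. A GFS locking problem made GFS inaccessible, clearly a serious bug; upgrading to a newer version fixed it.</p>
<p>4. Sify.com's experience pairing Lighty (lighttpd) with PHP was not a happy one. PHP_FCGI is said to be unstable, leaving processes hung with the CPU pegged. I have no experience here, so I will not comment.</p>
<h2>Summary</h2>
<p>It is late and I am sleepy, so I will stop here.</p>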
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/05/144.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>An Introduction to google collections</title>
		<link>http://www.kafka0102.com/2010/05/139.html</link>
		<comments>http://www.kafka0102.com/2010/05/139.html#comments</comments>
		<pubDate>Sat, 15 May 2010 06:25:46 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[java]]></category>
		<category><![CDATA[BiMap]]></category>
		<category><![CDATA[google collections]]></category>
		<category><![CDATA[Immutable Collections]]></category>
		<category><![CDATA[MapMaker]]></category>
		<category><![CDATA[Multimap]]></category>
		<category><![CDATA[Multiset]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=139</guid>
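		<description><![CDATA[google collections is a collections library that Google engineers built in their fabled "20% time". It extends java.util and offers many practical classes that simplify code. google collections uses generics, so it requires JDK 1.5 or later. Unlike apache commons collections, its authors did not cater to older JDK versions; one reason is that Google itself runs JDK 1.5 or later, the other is that the casts are just too ugly. The library is now at version 1.0 and is quite stable. Its features and implementation drew on wide review (including Josh Bloch, the father of java.util), so the quality is what you would expect, and it may eventually find its way into the JDK. The project lives at http://code.google.com/p/google-collections/. This post briefly introduces its core classes.]]></description>
			<content:encoded><![CDATA[<p>google collections is a collections library that Google engineers built in their fabled "20% time". It extends java.util and offers many practical classes that simplify code. google collections uses generics, so it requires JDK 1.5 or later. Unlike apache commons collections, its authors did not cater to older JDK versions; one reason is that Google itself runs JDK 1.5 or later, the other is that the casts are just too ugly. The library is now at version 1.0 and is quite stable. Its features and implementation drew on wide review (including Josh Bloch, the father of java.util), so the quality is what you would expect, and it may eventually find its way into the JDK. The project lives at http://code.google.com/p/google-collections/. This post briefly introduces its core classes.</p>
<h2>Immutable Collections</h2>
<p>Item 13 of Effective Java covers the benefits of immutable classes and how to write them; see that item if you are interested. For immutable collections, java.util.Collections offers a family of static unmodifiableFoo methods. What they return is a view wrapped around the original collection, so the unmodifiable collection and the original share the same underlying data; the wrapper merely rejects modifying operations. That is not always what you want. Sometimes the immutable collection should be detached from the original, so that later changes to the original do not affect it. google collections provides exactly this, in the form of ImmutableList, ImmutableMap, ImmutableSet and so on. Sample snippets follow.</p>
<p>The original approach with java.util.Collections:</p>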

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">&nbsp;
	List<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span> list <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> ArrayList<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	list.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	list.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;2&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	list.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;3&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	List<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span> immutableList <span style="color: #339933;">=</span> <span style="color: #003399;">Collections</span>.<span style="color: #006633;">unmodifiableList</span><span style="color: #009900;">&#40;</span>list<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>The same with com.google.common.collect.ImmutableList:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">&nbsp;
	List<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span> immutableList <span style="color: #339933;">=</span> ImmutableList.<span style="color: #006633;">of</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;1&quot;</span>,<span style="color: #0000ff;">&quot;2&quot;</span>,<span style="color: #0000ff;">&quot;3&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

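<p>The com.google.common.collect.ImmutableFoo classes create a new immutable collection through the static of method, which takes a variable number of arguments (copyOf builds one from an existing collection). Because an ImmutableFoo offers only read operations and manages its own data, it can perform somewhat better than the collection classes in java.util. One more point: an immutable collection has no say over whether the elements it holds are mutable, so the elements should preferably be immutable as well.</p>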
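<p>The difference between a view and a detached copy can be checked with the JDK alone. The sketch below does not use google collections; ViewVersusCopySketch, view and snapshot are names made up for this example, and snapshot only imitates the detached behaviour by copying before wrapping.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Contrasts an unmodifiable view with an unmodifiable copy.
final class ViewVersusCopySketch {

    // A read-only view: later changes to source remain visible through it.
    static List<String> view(List<String> source) {
        return Collections.unmodifiableList(source);
    }

    // A read-only snapshot: copies first, so it is detached from source.
    static List<String> snapshot(List<String> source) {
        return Collections.unmodifiableList(new ArrayList<String>(source));
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("1", "2", "3"));
        List<String> view = view(list);
        List<String> snapshot = snapshot(list);

        list.add("4"); // mutate the original afterwards

        System.out.println("view size: " + view.size());
        System.out.println("snapshot size: " + snapshot.size());
    }
}
```

<p>After the add, the view reports four elements and the snapshot still reports three. Both reject direct modification with an UnsupportedOperationException.</p>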
<h2>Multiset &amp; Multimap</h2>
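<p>java.util.Set is an unordered collection without duplicate elements. Multiset is also unordered but accepts repeated elements, recording the repeats as a count per element. Multiset comes in several implementations (HashMultiset, LinkedHashMultiset, TreeMultiset and others). A usage snippet:</p>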

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">&nbsp;
	Multiset<span style="color: #339933;">&lt;</span>String<span style="color: #339933;">&gt;</span> set <span style="color: #339933;">=</span> HashMultiset.<span style="color: #006633;">create</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	set.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	set.<span style="color: #006633;">add</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>set.<span style="color: #006633;">count</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">// prints 2</span>
&nbsp;
	set.<span style="color: #006633;">setCount</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span>, <span style="color: #cc66cc;">5</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>set.<span style="color: #006633;">count</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">// prints 5</span></pre></div></div>

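<p>Multiset has plenty of uses, such as counting user visits. Without Multiset you would keep the counts in something like a Map, and every increment would mean fetching the current count, adding one, and putting it back, which is clearly less convenient than Multiset.</p>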
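<p>To make the comparison concrete, here is a minimal plain-JDK sketch of the Map-based counting that Multiset replaces. The class name MapCountSketch and the increment helper are made up for illustration and are not part of google collections.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Counting with a plain Map: every increment is a get / add-one / put round trip.
final class MapCountSketch {

    // Increments the counter stored under key and returns the new count.
    static int increment(Map<String, Integer> counts, String key) {
        Integer old = counts.get(key);              // 1) read the current count (null if absent)
        int updated = (old == null) ? 1 : old + 1;  // 2) add one
        counts.put(key, updated);                   // 3) write it back
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        increment(counts, "kafka0102");
        increment(counts, "kafka0102");
        System.out.println(counts.get("kafka0102")); // prints 2
    }
}
```

<p>With Multiset the three steps collapse into a single add call, and count returns 0 for an element that was never added instead of null.</p>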
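<p>Multimap is another handy collection. A Multimap&lt;K,V&gt; is roughly equivalent to a Map&lt;K,Collection&lt;V&gt;&gt;. Implementing the same behaviour on a plain Map means repeating the same three-step get/modify/put routine on the map's values. Multimap also comes in several implementations (hash-based, linked-list-based, and so on). Sample code using Multimap:</p>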

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">&nbsp;
	Multimap<span style="color: #339933;">&lt;</span>String,String<span style="color: #339933;">&gt;</span> map <span style="color: #339933;">=</span> HashMultimap.<span style="color: #006633;">create</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span>,<span style="color: #0000ff;">&quot;1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span>,<span style="color: #0000ff;">&quot;2&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">// prints both values, e.g. [2, 1]; HashMultimap does not guarantee the order</span></pre></div></div>
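<p>For contrast, the Map&lt;K,Collection&lt;V&gt;&gt; equivalent can be sketched with the JDK alone. This is only an illustration of the bookkeeping Multimap hides. The class MapOfSetsSketch is hypothetical, and it uses a LinkedHashSet per key so that the printed order is predictable, which is a choice of this sketch rather than HashMultimap's behaviour.</p>

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hand-rolled "multimap": a Map whose values are sets of values.
final class MapOfSetsSketch {
    private final Map<String, Set<String>> map = new HashMap<String, Set<String>>();

    // Adds value under key, creating the per-key set on first use.
    void put(String key, String value) {
        Set<String> values = map.get(key);      // 1) fetch the collection
        if (values == null) {                   // 2) create it if missing
            values = new LinkedHashSet<String>();
            map.put(key, values);
        }
        values.add(value);                      // 3) modify it
    }

    // Returns the values for key, or an empty set if the key is unknown.
    Set<String> get(String key) {
        Set<String> values = map.get(key);
        return values == null ? Collections.<String>emptySet() : values;
    }

    public static void main(String[] args) {
        MapOfSetsSketch m = new MapOfSetsSketch();
        m.put("kafka0102", "1");
        m.put("kafka0102", "2");
        System.out.println(m.get("kafka0102")); // prints [1, 2]
    }
}
```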

<h2>BiMap</h2>
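<p>BiMap (bidirectional map) is a two-way map. java.util.Map is a one-way map: it looks up a value by key. If you need to look up a key by value, or need the inverse Map&lt;V,K&gt;, BiMap is a good choice; otherwise you would have to maintain two Maps yourself. Its concrete implementations are EnumBiMap, EnumHashBiMap, HashBiMap and ImmutableBiMap. Sample code:</p>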

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">&nbsp;
	BiMap<span style="color: #339933;">&lt;</span>String,String<span style="color: #339933;">&gt;</span> map <span style="color: #339933;">=</span> HashBiMap.<span style="color: #006633;">create</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span>,<span style="color: #0000ff;">&quot;1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;kafka0102&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>map.<span style="color: #006633;">inverse</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;1&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">// there is no dedicated method for reverse lookup; call inverse().get instead</span></pre></div></div>
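<p>The "two Maps" alternative mentioned above might look like the following JDK-only sketch. TwoMapsSketch and its method names are invented for this example. Rejecting a value that is already bound to a different key follows my understanding of BiMap's documented put contract, so treat that detail as an assumption.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Two plain maps kept in sync by hand to allow lookups in both directions.
final class TwoMapsSketch {
    private final Map<String, String> forward = new HashMap<String, String>();
    private final Map<String, String> backward = new HashMap<String, String>();

    // Binds key to value in both directions; a value may belong to only one key.
    void put(String key, String value) {
        String owner = backward.get(value);
        if (owner != null && !owner.equals(key)) {
            throw new IllegalArgumentException("value already present: " + value);
        }
        String oldValue = forward.put(key, value);
        if (oldValue != null) {
            backward.remove(oldValue); // drop the stale reverse entry
        }
        backward.put(value, key);
    }

    String getValue(String key) {
        return forward.get(key);
    }

    String getKey(String value) {
        return backward.get(value);
    }

    public static void main(String[] args) {
        TwoMapsSketch m = new TwoMapsSketch();
        m.put("kafka0102", "1");
        System.out.println(m.getValue("kafka0102")); // prints 1
        System.out.println(m.getKey("1"));           // prints kafka0102
    }
}
```

<p>Every write has to touch both maps and clean up stale reverse entries, which is exactly the bookkeeping BiMap takes over.</p>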

<h2>MapMaker</h2>
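<p>MapMaker is a builder for ConcurrentMap that lets the map's keys and values be held through weak or soft references. In particular, its makeComputingMap method can compute a value from a key: when no value has been put for a key, the resulting ConcurrentMap computes one with the supplied Function and associates it with the key, so later accesses do not recompute it. Sample code:</p>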

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;">&nbsp;
	ConcurrentMap<span style="color: #339933;">&lt;</span>String, Integer<span style="color: #339933;">&gt;</span> map <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> MapMaker<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>
		.<span style="color: #006633;">concurrencyLevel</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">32</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">softKeys</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">weakValues</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">expiration</span><span style="color: #009900;">&#40;</span><span style="color: #cc66cc;">30</span>,
			TimeUnit.<span style="color: #006633;">MINUTES</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">makeComputingMap</span><span style="color: #009900;">&#40;</span>
&nbsp;
		<span style="color: #000000; font-weight: bold;">new</span> Function<span style="color: #339933;">&lt;</span>String, Integer<span style="color: #339933;">&gt;</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
&nbsp;
			<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Integer</span> apply<span style="color: #009900;">&#40;</span><span style="color: #003399;">String</span> key<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
&nbsp;
				<span style="color: #000000; font-weight: bold;">return</span> <span style="color: #003399;">Integer</span>.<span style="color: #006633;">parseInt</span><span style="color: #009900;">&#40;</span>key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
			<span style="color: #009900;">&#125;</span>
&nbsp;
		<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;123&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">// prints 123</span>
&nbsp;
	map.<span style="color: #006633;">put</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;123&quot;</span>, <span style="color: #cc66cc;">124</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #003399;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>map.<span style="color: #006633;">get</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;123&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><span style="color: #666666; font-style: italic;">// prints 124</span></pre></div></div>
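<p>The compute-on-miss idea can also be tried without MapMaker. The sketch below swaps in the JDK's ConcurrentHashMap.computeIfAbsent (Java 8 or later) as a stand-in. It does not reproduce MapMaker's soft/weak references or expiration, and the wrapper class ComputeOnMissSketch is made up for illustration.</p>

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Compute-on-miss cache built on the JDK only (no reference types, no expiry).
final class ComputeOnMissSketch {
    private final ConcurrentMap<String, Integer> cache = new ConcurrentHashMap<>();
    private final Function<String, Integer> loader;

    ComputeOnMissSketch(Function<String, Integer> loader) {
        this.loader = loader;
    }

    // Returns the cached value, computing and storing it on the first access.
    Integer get(String key) {
        return cache.computeIfAbsent(key, loader);
    }

    // An explicit put overrides whatever was computed or stored before.
    void put(String key, Integer value) {
        cache.put(key, value);
    }

    public static void main(String[] args) {
        ComputeOnMissSketch map = new ComputeOnMissSketch(Integer::parseInt);
        System.out.println(map.get("123")); // prints 123
        map.put("123", 124);
        System.out.println(map.get("123")); // prints 124
    }
}
```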

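<p>google collections offers other useful classes as well; see its API doc and http://publicobject.com/2007/09/series-recap-coding-in-small-with.html. This article stops here, but I strongly recommend studying the implementation of google collections when you have time. A foundational library like this looks simple, yet implementing it elegantly and efficiently takes real skill.</p>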
]]></content:encoded>
			<wfw:commentRss>http://www.kafka0102.com/2010/05/139.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Implementation Analysis of Zoie, a Real-Time Search System</title>
		<link>http://www.kafka0102.com/2010/05/133.html</link>
		<comments>http://www.kafka0102.com/2010/05/133.html#comments</comments>
		<pubDate>Sun, 09 May 2010 05:28:29 +0000</pubDate>
		<dc:creator>kafka0102</dc:creator>
				<category><![CDATA[search]]></category>
		<category><![CDATA[LinkedIn zoie]]></category>
		<category><![CDATA[zoie]]></category>
		<category><![CDATA[real-time search]]></category>

		<guid isPermaLink="false">http://www.kafka0102.com/?p=133</guid>
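		<description><![CDATA[Zoie is a Lucene-based real-time search system open-sourced by LinkedIn. For an introduction and basic usage, see my previous article "Building a Real-Time Search System with Zoie". Having studied and understood Zoie's source code, I analyze its implementation in this article.]]></description>
			<content:encoded><![CDATA[<p>Zoie is a Lucene-based real-time search system open-sourced by LinkedIn. For an introduction and basic usage, see my previous article "<a href="http://www.kafka0102.com/2010/05/%E4%BD%BF%E7%94%A8zoie%E6%9E%84%E5%BB%BA%E5%AE%9E%E6%97%B6%E6%A3%80%E7%B4%A2%E7%B3%BB%E7%BB%9F/">Building a Real-Time Search System with Zoie</a>". Having studied and understood Zoie's source code, I analyze its implementation in this article.</p>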
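<h2>Core principle of real-time search</h2>
<p>In a typical search system, indexing and querying are separate: the index is built offline, and new index data is made available to the query side at some interval (say, every 5 minutes). For some site-search scenarios this latency is acceptable, which means index building does not need to be especially fast (it only has to keep up with the publishing interval) and query results need not be perfectly up to date. To get real-time search, the typical approach is to build the index and serve queries in the same process, so that every addition to the index is seen by the next query. The details still need careful thought, and Zoie's Lucene-based solution is as follows. There are two kinds of index: the ram index and the disk index. Indexing first writes into the ram index; because this is an in-memory operation it is usually fast, and afterwards the IndexReader is reopened so that queries see the latest index. When the number of documents indexed in memory reaches a threshold (10000) or the elapsed time reaches a threshold (configurable), a background thread merges the ram index into the disk index, then clears the now-useless ram index and reopens the disk index's IndexReader for queries (this step includes warming up the IndexReader). Notably, Zoie has two ram indexes, so while one is being merged into the disk index (which can take a long time), the other can still accept indexing. Queries use both ram indexes and the disk index, so data becomes searchable as soon as it has been indexed in memory.</p>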
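<h2>Implementation overview</h2>
<p>The core interfaces and classes of Zoie are briefly described below.</p>
<p>ZoieSystem: the core public-facing class. It exposes many methods to callers, but it is essentially a Facade that wraps methods of its members.</p>
<p>DataConsumer: as the name suggests, this interface consumes data, that is, builds the index. For real-time indexing, the DataConsumer that ZoieSystem uses by default is RealtimeIndexDataLoader. When consuming data, RealtimeIndexDataLoader mainly converts it into an internal structure and hands it to another DataConsumer, RAMLuceneIndexDataLoader, which does the actual in-memory indexing. Afterwards, if the number of documents processed has reached the threshold, RealtimeIndexDataLoader notifies LoaderThread, which calls DiskLuceneIndexDataLoader to merge the index.</p>
<p>DiskSearchIndex and RAMSearchIndex: these two classes are how Zoie operates on the index structures, for example obtaining or opening the IndexReader and IndexWriter for a given directory, or flushing index updates to disk.</p>
<p>DataProvider: this represents the source of the data. Reading the Zoie code, I found that if the program crashes during indexing, the in-memory index may be lost. One way to handle this is on the DataProvider side: most directly, replay the data from a recent period when the program restarts (since Zoie flushes data periodically, the point in time to replay from can be calculated).</p>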
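<h2>The indexing process</h2>
<p>The indexing process was outlined above; the diagram from the Zoie wiki below makes it more concrete. When analyzing the implementation, there is a RAM structure that deserves close attention: it holds two RAMSearchIndex members (Ram A and Ram B) and one DiskSearchIndex member, and Ram A and Ram B also play the roles of Ram writable and Ram readable. Indexing uses Ram writable, and queries use Ram readable. As the diagram shows, Ram A and Ram B go through a swap and a clear: 1) The swap happens just before Ram A is merged into the Disk Index: A's data is moved to Ram B, a fresh Ram A starts handling clients' indexing requests, and Ram B stops receiving data and concentrates on the merge. 2) Once the merge finishes, Ram B is cleared.</p>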
<p><a href="http://www.kafka0102.com/wp-content/uploads/2010/05/timeline.jpg"><img class="aligncenter size-full wp-image-135" title="zoie index timeline" src="http://www.kafka0102.com/wp-content/uploads/2010/05/timeline.jpg" alt="" width="635" height="476" /></a></p>
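<p>The swap-and-clear cycle can be mimicked with a toy model in which plain lists stand in for the Lucene indexes. This is only a sketch of the idea described above, not Zoie's code: the class DoubleBufferedIndexSketch and its methods are hypothetical, and the merge is invoked by the caller rather than by a background thread so that the example stays deterministic.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the Ram A / Ram B / disk arrangement; lists stand in for indexes.
final class DoubleBufferedIndexSketch {
    private List<String> ramWritable = new ArrayList<String>(); // "Ram A": accepts new documents
    private List<String> ramReadable = new ArrayList<String>(); // "Ram B": frozen while it is merged
    private final List<String> disk = new ArrayList<String>();  // stand-in for the disk index
    private final int threshold;

    DoubleBufferedIndexSketch(int threshold) {
        this.threshold = threshold;
    }

    // Adds a document; returns true once the writable buffer has reached the threshold.
    synchronized boolean index(String doc) {
        ramWritable.add(doc);
        return ramWritable.size() >= threshold;
    }

    // Step 1: freeze the current writable buffer and start a fresh one.
    synchronized void swapForFlush() {
        if (!ramReadable.isEmpty()) {
            throw new IllegalStateException("previous merge has not finished");
        }
        ramReadable = ramWritable;
        ramWritable = new ArrayList<String>();
    }

    // Step 2: merge the frozen buffer into the disk index, then clear it.
    synchronized void mergeToDisk() {
        disk.addAll(ramReadable);
        ramReadable = new ArrayList<String>();
    }

    // Queries see the disk index plus both RAM buffers.
    synchronized List<String> search() {
        List<String> visible = new ArrayList<String>(disk);
        visible.addAll(ramReadable);
        visible.addAll(ramWritable);
        return visible;
    }

    synchronized int diskSize() {
        return disk.size();
    }

    public static void main(String[] args) {
        DoubleBufferedIndexSketch idx = new DoubleBufferedIndexSketch(2);
        idx.index("a");
        idx.index("b");                   // threshold reached
        idx.swapForFlush();
        idx.index("c");                   // still accepted while the merge is pending
        System.out.println(idx.search()); // prints [a, b, c]
        idx.mergeToDisk();
        System.out.println(idx.search()); // prints [a, b, c]
    }
}
```

<p>The point of the second buffer is visible in main: document c is indexed and searchable between the swap and the merge, so a slow merge never blocks indexing.</p>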
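<h2>Deleting data</h2>
<p>Zoie provides no interface for deleting from the index; it treats every submission as either an add or an update. When indexing, Zoie first maps the document's uid to a docid, and if the docid already exists, that doc must be marked as deleted. In lucene the deletion marks live in the xx.del file, and Zoie does eventually write the marks there. But because the index consists of two Ram indexes and one disk index, and the disk index cannot be updated on every deletion mark, Zoie records deletion marks in both kinds of SearchIndex object. During indexing, Zoie updates the in-memory deletion marks of all three SearchIndex instances, and queries filter out the deleted docs. Zoie also provides an expungeDeletes method to purge garbage index data from the disk index. Because it is time-consuming, this operation is best run in the early morning hours; judging from the Zoie code, however, it can only be triggered manually through JMX, with no automatic scheduling.</p>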
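<p>The "update means mark the old doc deleted, then add the new one" scheme can be illustrated with a small in-memory model. The sketch below is not taken from Zoie. DeleteMarkSketch, its fields, and the use of a list position as the docid are all assumptions made for the example.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy index in which an update marks the old doc as deleted instead of removing it.
final class DeleteMarkSketch {
    private final List<String> docs = new ArrayList<String>();                  // position = docid
    private final Map<Long, Integer> uidToDocid = new HashMap<Long, Integer>();
    private final Set<Integer> deleted = new HashSet<Integer>();                // in-memory deletion marks

    // Adds a document; if the uid is already indexed, the old doc is marked deleted.
    void addOrUpdate(long uid, String content) {
        Integer oldDocid = uidToDocid.get(uid);
        if (oldDocid != null) {
            deleted.add(oldDocid);
        }
        docs.add(content);
        uidToDocid.put(uid, docs.size() - 1);
    }

    // Returns all live documents, filtering out those marked as deleted.
    List<String> search() {
        List<String> hits = new ArrayList<String>();
        for (int docid = 0; docid < docs.size(); docid++) {
            if (!deleted.contains(docid)) {
                hits.add(docs.get(docid));
            }
        }
        return hits;
    }

    // Number of dead docs that still occupy space until they are expunged.
    int deletedCount() {
        return deleted.size();
    }

    public static void main(String[] args) {
        DeleteMarkSketch index = new DeleteMarkSketch();
        index.addOrUpdate(1L, "profile v1");
        index.addOrUpdate(2L, "other doc");
        index.addOrUpdate(1L, "profile v2"); // update: docid 0 is now marked deleted
        System.out.println(index.search());  // prints [other doc, profile v2]
    }
}
```

<p>The dead entry stays in docs until something rewrites the list, which is the role an expunge step plays for the disk index.</p>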
<h2>ZoieMergePolicy</h2>
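<p>Zoie's index merge policy is arguably one of its highlights. The default MergePolicy in lucene is LogByteSizeMergePolicy, which selects segments to merge based on each segment's total size in bytes. One drawback is that for data such as user profiles, where updates are frequent (each update involves a delete), some segments look large while actually holding little live index data, and this dead data burdens queries. ZoieMergePolicy excludes deleted docs when computing index size, which makes the calculation more accurate. The chart below is Zoie's performance comparison of the two MergePolicies: over time, as more and more docs are marked deleted, query performance under LogByteSizeMergePolicy degrades badly. However, if expungeDeletes is run once a day during off-peak hours and few deletes are submitted each day, the problem with LogByteSizeMergePolicy is not serious. One more point: for segment counts, Zoie defaults to at most 10 large segments and 20 small segments (controllable via the merge factor), so the segment count usually stays in the teens. With that many segments, query performance suffers somewhat; the benefit is that old large segments are not merged frequently.</p>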
<p><a href="http://www.kafka0102.com/wp-content/uploads/2010/05/mergeperf.png"><img class="aligncenter size-full wp-image-134" title="zoie merge perf" src="http://www.kafka0102.com/wp-content/uploads/2010/05/mergeperf.png" alt="" width="640" height="480" /></a></p>
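<p>To see why excluding deleted docs changes the picture, consider one plausible way of discounting them: scale a segment's byte size by its fraction of live docs. This formula is an assumption of the sketch and is not copied from ZoieMergePolicy; SegmentSizeSketch and liveSize are hypothetical names.</p>

```java
// Compares a segment's raw byte size with a size discounted by its deleted docs.
final class SegmentSizeSketch {

    // Size scaled by the fraction of live docs; an empty segment counts as 0.
    static long liveSize(long bytes, int maxDoc, int deletedDocs) {
        if (maxDoc == 0) {
            return 0;
        }
        return bytes * (maxDoc - deletedDocs) / maxDoc;
    }

    public static void main(String[] args) {
        long bytes = 100L * 1024 * 1024; // a segment that occupies 100 MB on disk
        // 900 of its 1000 docs were superseded by updates, so only 10% is live data.
        System.out.println(liveSize(bytes, 1000, 900)); // prints 10485760
    }
}
```

<p>A policy that looks only at raw bytes treats this segment as a 100 MB one, while a deletion-aware policy sees roughly 10 MB of live data and can group it with small segments for merging.</p>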
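<h2>Summary</h2>
<p>The above is a brief analysis of Zoie's implementation. If anything here is inaccurate or misleading, please point it out, and accept my apologies.</p>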
<div id="_mcePaste" style="position: absolute; left: -10000px; top: 0px; width: 1px; height: 1px; overflow: hidden;"><!-- 		@page { margin: 2cm } 		P { margin-bottom: 0.21cm } 		H2 { margin-bottom: 0.21cm } 		H2.western { font-family: "DejaVu Sans", sans-serif; font-size: 14pt; font-style: italic } 		H2.cjk { font-family: "AR PL UKai CN"; font-size: 14pt; font-style: italic } 		H2.ctl { font-family: "AR PL UKai CN"; font-size: 14pt; font-style: italic } 		A:link { so-language: zxx } --></p>
<p style="margin-bottom: 0cm;"><span style="font-family: AR PL UMing CN,serif;"> Zoie</span>没有提供删除索引的接口，它认为每一次的提交或者是<span style="font-family: AR PL UMing CN,serif;">add</span>或者是<span style="font-family: AR PL UMing CN,serif;">update</span>。在建索引时，<span style="font-family: AR PL UMing CN,serif;">Zoie</span>先将<span style="font-family: AR PL UMing CN,serif;">document</span>的<span style="font-family: AR PL UMing CN,serif;">uid</span>映射成<span style="font-family: AR PL UMing CN,serif;">docid</span>，如果发现<span style="font-family: AR PL UMing CN,serif;">docid</span>已存在，就需要标记删除该<span style="font-family: AR PL UMing CN,serif;">doc</span>。<span style="font-family: AR PL UMing CN,serif;">lucene</span>里表示删除标记的文件是<span style="font-family: AR PL UMing CN,serif;">xx.del</span>，<span style="font-family: AR PL UMing CN,serif;">Zoie</span>当然会最终将标记更新到这个文件，但因为索引结构有两个<span style="font-family: AR PL UMing CN,serif;">Ram index</span>和一个<span style="font-family: AR PL UMing CN,serif;">disk index</span>，并且不能每一次标记删除就更新<span style="font-family: AR PL UMing CN,serif;">disk index</span>，所以<span style="font-family: AR PL UMing CN,serif;">Zoie</span>在两种<span style="font-family: AR PL UMing CN,serif;">SearchIndex</span>对象里记录了删除标记。当建索引，<span style="font-fami