<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"><channel><title>Disruptive Library Technology Jester &#187; apache</title> <atom:link href="http://dltj.org/tag/apache/feed/" rel="self" type="application/rss+xml" /><link>http://dltj.org</link> <description>We&#039;re Disrupted, We&#039;re Librarians, and We&#039;re Not Going to Take It Anymore</description> <lastBuildDate>Mon, 06 Feb 2012 20:04:22 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <cloud domain='dltj.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' /> <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/3.0/us/</creativeCommons:license> <item><title>Fronting Tomcat with Apache HTTPD to Remove Ports and Context Paths</title><link>http://dltj.org/article/apache-httpd-and-tomcat/</link> <comments>http://dltj.org/article/apache-httpd-and-tomcat/#comments</comments> <pubDate>Thu, 20 Sep 2007 02:31:33 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[apache]]></category> <category><![CDATA[howto]]></category> <category><![CDATA[httpd]]></category> <category><![CDATA[tomcat]]></category> <category><![CDATA[usability]]></category> <category><![CDATA[web development]]></category><guid isPermaLink="false">http://dltj.org/2007/09/apache-httpd-and-tomcat/</guid> <description><![CDATA[In this How-To guide, I show a combination of software and configuration to clean up URLs by removing the port numbers of the Java servlet engine 
(Tomcat) and the context path of the application. The goal is to create &#8220;cool &#8230; <a href="http://dltj.org/article/apache-httpd-and-tomcat/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/09/apache-httpd-and-tomcat/"></abbr><p>In this How-To guide, I show a combination of software and configuration to clean up URLs by removing the port numbers of the Java servlet engine (Tomcat) and the context path of the application.  The goal is to create &#8220;<a href="http://www.w3.org/Provider/Style/URI" title="Hypertext Style: Cool URIs don't change.">cool URLs</a>&#8221; that are short (removing the unnecessary context path) and follow conventions (using the default port &#8220;80&#8221; rather than &#8220;8080&#8221;).  OhioLINK also uses a custom access control module &#8212; built for Apache HTTPD &#8212; which makes fronting Tomcat with Apache HTTPD even more desirable.</p><p><h2>Requirement</h2><br />We&#8217;re making use of the latest line of development for the Apache HTTPD series: <a href="http://httpd.apache.org/docs/2.2/" title="Apache HTTPD 2.2.x documentation">version 2.2.x</a>.  The inclusion of <a href="http://httpd.apache.org/docs/2.2/mod/mod_proxy_ajp.html" title="Apache HTTPD mod_proxy_ajp documentation">mod_proxy_ajp</a> &#8212; replacing the custom &#8220;mod_jk&#8221; with a module that extends the httpd proxy engine &#8212; in the latest major release of HTTPD makes our task much easier.  This solution also uses HTTPD&#8217;s mod_rewrite and an add-on module called <a href="http://apache.webthing.com/mod_proxy_html/" title="mod_proxy_html Apache HTTPD module homepage">mod_proxy_html</a>.  No additions or changes are needed to the stock Tomcat installation.</p><p><h2>The Plan</h2><br />There are two overall tasks that we&#8217;re going to ask the HTTPD server to do.  
First, receive the incoming HTTP request and proxy it to the Tomcat servlet engine using the AJP protocol.  Second, rewrite the URL paths of the headers and the X/HTML body from the Tomcat servlet engine to eliminate any instances of the context path.  In a visual sense, what we are trying to do is rewrite the path so it can be processed by Tomcat (the green box), then remove the extraneous parts of the path in the resulting headers and X/HTML (the red box):</p><table cellpadding="2" cellspacing="0"><tr><td align="right"><i>Public Request URLs:&nbsp;&nbsp;</i></td><td colspan="2" align="right">http://e.com</td><td colspan="2">/remaining/path?and=params</td></tr><tr><td align="right"><i>URLs sent to Tomcat:&nbsp;&nbsp;</i></td><td>http://e.com</td><td style="background-color: #FFFFCC; padding-right: 0; margin-right: 0; border-left: 1px solid green; border-top: 1px solid green; border-bottom: 1px solid green;">:8080</td><td style="background-color: #FFFFCC; padding-left: 0; margin-left: 0; border-right: 1px solid green; border-top: 1px solid green; border-bottom: 1px solid green;">/context_path</td><td>/remaining/path?and=params</td></tr><tr><td align="right"><i>URLs as output by Tomcat:&nbsp;&nbsp;</i></td><td>http://e.com</td><td style="background-color: #FFFFCC; padding-right: 0; margin-right: 0; border-left: 1px solid red; border-top: 1px solid red; border-bottom: 1px solid red;">:8080</td><td style="background-color: #FFFFCC; padding-left: 0; margin-left: 0; border-right: 1px solid red; border-top: 1px solid red; border-bottom: 1px solid red;">/context_path</td><td>/next/page</td></tr><tr><td align="right"><i>URLs as seen by browser:&nbsp;&nbsp;</i></td><td colspan="2" align="right">http://e.com</td><td colspan="2">/next/page</td></tr></table><p>The first half of this problem, modifying requests as they come into the Apache HTTPD server, will be handled by a <a href="http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html" title="Apache HTTPD mod_rewrite 
documentation">mod_rewrite</a> rule that rewrites the request to something Tomcat can understand and then internally redirects it to Tomcat via the AJP proxy.  (Note that we are not simply using <a href="http://httpd.apache.org/docs/2.2/mod/mod_proxy.html#proxypass" title="'ProxyPass' directive in HTTPD mod_proxy documentation">ProxyPass</a> here because we want to send the request through the AJP interface to the Tomcat server, and <a href="http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule" title="'RewriteRule' directive in HTTPD mod_rewrite documentation">RewriteRule</a> allows us to do that with a <code>[P]</code> flag at the end of the RewriteRule line.)  The second half uses a combination of <a href="http://httpd.apache.org/docs/2.2/mod/mod_proxy.html#proxypassreverse" title="'ProxyPassReverse' directive in HTTPD mod_proxy documentation">ProxyPassReverse</a> (part of the Apache-supplied mod_proxy extension; it adjusts the URL in the Location, Content-Location and URI headers), <a href="http://httpd.apache.org/docs/2.2/mod/mod_proxy.html#proxypassreversecookiepath" title="'ProxyPassReverseCookiePath' directive in HTTPD mod_proxy documentation">ProxyPassReverseCookiePath</a> (also part of the Apache-supplied mod_proxy extension; it rewrites the path string in Set-Cookie headers), and <a href="http://apache.webthing.com/mod_proxy_html/config.html" title="mod_proxy_html configuration documentation">ProxyHTMLURLMap</a> (from mod_proxy_html, a third-party extension that rewrites URLs inside X/HTML documents).</p><p><h2>Preparations</h2><br />The &#8216;mod_proxy_html&#8217; extension is likely new to your Apache HTTPD installation, so we need to download the source, compile it, and move it into the proper directory.  
Fortunately, this is rather straightforward:</p><div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">wget</span> <span style="color: #ff0000;">'http://apache.webthing.com/mod_proxy_html/mod_proxy_html-2.5.2.c'</span>
apxs <span style="color: #660033;">-c</span> -I<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>include<span style="color: #000000; font-weight: bold;">/</span>libxml2 <span style="color: #660033;">-i</span> mod_proxy_html-2.5.2.c</pre></div></div><p>Note that we are not using the mod_proxy_html author&#8217;s 3.0 version here.  In my set-up, the 3.0 version was causing Apache HTTPD to dump core on <em>every</em> request (whether proxied or not), and the prior release works just fine for our purposes.  The <code>apxs</code> line will compile, link, and copy the resulting library to the Apache modules directory for us.</p><p><h2>The Configuration</h2><br />These are the contents of a &#8216;tomcat-proxy.conf&#8217; file that is placed in the &#8216;conf.d&#8217; directory of the Apache HTTPD configuration directory (most likely <code>/etc/httpd/conf.d/tomcat-proxy.conf</code>, although your installation may vary).</p><div class="wp_syntax"><div class="code"><pre class="config" style="font-family:monospace;">#
#  Information about 'mod_proxy_html' can be found at 
#   http://apache.webthing.com/mod_proxy_html/
LoadFile    /usr/lib/libxml2.so
LoadModule  proxy_html_module    modules/mod_proxy_html-2.5.2.so
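&nbsp;
# Assumption (not part of the original file): the directives below also
# depend on the stock proxy, AJP, rewrite, and headers modules.  Many
# packaged HTTPD 2.2 installations load them already; if yours does not,
# lines like these (module paths are illustrative) would be needed:
#LoadModule  proxy_module         modules/mod_proxy.so
#LoadModule  proxy_ajp_module     modules/mod_proxy_ajp.so
#LoadModule  rewrite_module       modules/mod_rewrite.so
#LoadModule  headers_module       modules/mod_headers.so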
&nbsp;
# DON'T TURN ProxyRequests ON!  Bad things will happen
# http://httpd.apache.org/docs/2.2/mod/mod_proxy.html#access
# http://www.akadia.com/services/prevent_abuse_proxy.html
ProxyRequests off
&nbsp;
# Necessary to have mod_proxy_html do the rewriting
RequestHeader      unset  Accept-Encoding
&nbsp;
# Rewrite the URLs to proxy (&quot;[P]&quot;) into the Tomcat server
RewriteEngine     on
RewriteRule ^/(.*)      ajp://localhost:8009/context_path/$1    [P]
&nbsp;
# Be prepared to rewrite the HTML/CSS files as they come back
# from Tomcat
SetOutputFilter proxy-html
&nbsp;
# Rewrite JavaScript and CSS files in addition to HTML files
ProxyHTMLExtended on
&nbsp;
# Output Strict XHTML (add &quot;Legacy&quot; to the end of the line below
# to output Transitional XHTML)
ProxyHTMLDoctype XHTML 
&nbsp;
# Rewrite HTTP headers and HTML/CSS links for everything else
ProxyPassReverse /context_path/ /
ProxyPassReverseCookiePath /context_path/ /
ProxyHTMLURLMap /context_path/ /</pre></div></div><p>That&#8217;s pretty much all there is to it.  You should note that mod_proxy_html, like any HTML scraper, requires modestly well-formed X/HTML.  If the markup is bad, the output from mod_proxy_html is likely to be unpredictable.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/apache-httpd-and-tomcat/feed/</wfw:commentRss> <slash:comments>10</slash:comments> </item> <item><title>Solr-ized MARC Record Catalog</title><link>http://dltj.org/article/miami-video-solr/</link> <comments>http://dltj.org/article/miami-video-solr/#comments</comments> <pubDate>Tue, 05 Jun 2007 03:29:13 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[apache]]></category> <category><![CDATA[lucene]]></category> <category><![CDATA[metadata]]></category> <category><![CDATA[opac]]></category> <category><![CDATA[solr]]></category><guid isPermaLink="false">http://dltj.org/2007/06/miami-video-solr/</guid> <description><![CDATA[Rob Casson of Miami University announced this weekend the beta availability of their video catalog. In a subsequent posting, Rob describes the user interface elements. Rob and the crew at Miami are seeking feedback on the interface, so if you &#8230; <a href="http://dltj.org/article/miami-video-solr/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/06/miami-video-solr/"></abbr><p>Rob Casson of Miami University <span class="removed_link" title="http://foam.lib.muohio.edu/blog/?p=13">announced this weekend</span> the <a href="http://beta.lib.muohio.edu/solr/videos/" title="Miami University Video Catalog">beta availability of their video catalog</a>.  In a subsequent posting, Rob <span class="removed_link" title="http://foam.lib.muohio.edu/blog/?p=14">describes the user interface elements</span>.  
Rob and the crew at Miami are seeking feedback on the interface, so if you have some <a href="http://beta.lib.muohio.edu/solr/videos/feedback.php" title="Video Catalog Feedback Form">be sure to offer it to them</a>.</p><p>A couple of notes on the mechanisms Rob is using. <a href="http://lucene.apache.org/solr/" title="Apache Solr homepage">Apache Solr</a> is an open source enterprise search server based on the <a href="http://lucene.apache.org/java/" title="Apache Lucene homepage">Lucene Java</a> search library (also an Apache project).  You can think of Lucene as the raw indexing and search engine with Solr layered on top to provide a non-Java interface to a rich feature set.  What Miami has done is extract all of the bibliographic and related item records out of their Innovative Interfaces system, write programs to transform that data into XML, index it with Solr/Lucene, and create a search interface.</p><p>Now what makes this really interesting is how much useful information is in the MARC record that doesn&#8217;t currently find its way into the WebPAC interface.  For instance, this snapshot shows the facets where the MARC record has fielded data that can be turned into browsable lists:<div style="border: 2px solid gray; font-size 90%; padding: .25em;"><img src="http://cdn.dltj.org/wp-content/uploads/2007/06/browse_by_lc_class1.png" alt="Browse Catalog by LC Class" title="Browse Catalog by LC Class" width="623" height="242" style="border-bottom: 1px dashed gray;" />Example from Miami University&#8217;s video catalog showing the available fielded data.</div><p> The corresponding <a href="http://holmes.lib.muohio.edu/search/X" title="Miami University Libraries:Advanced Keyword Search">WebPAC pre-search limits (for keyword searching)</a> include only a subset of languages, media formats, and locations, and do not include topic, genre, LC/SuDoc classes, or coverage date.  
In other words, there is a whole lot of information in the MARC record that isn&#8217;t being exposed in the normal WebPAC interface.  Since Miami is in full control over the data in the Solr-based index, though, they are free to include as much or as little in the end-user interface.</p><p>Combined with faceted browsing, this makes for a very simple and quick interface to narrow down a large set of records.  At the time of writing this entry, Miami&#8217;s video library consisted of 10,538 records.  In three clicks, one can narrow that down to <a href="http://beta.lib.muohio.edu/solr/videos/index.php?query=%5B%2A+TO+%2A%5D&#038;filter_name[]=lang_facet&#038;filter_value[]=fre&#038;filter_name[]=format_facet&#038;filter_value[]=DVD&#038;filter_name[]=topic_facet&#038;filter_value[]=Comedy+films&#038;keep_filters=1" title="http://beta.lib.muohio.edu/solr/videos/index.php?query=%5B%2A+TO+%2A%5D&#038;filter_name[]=lang_facet&#038;filter_value[]=fre&#038;filter_name[]=format_facet&#038;filter_value[]=DVD&#038;filter_name[]=topic_facet&#038;filter_value[]=Comedy+films&#038;keep_filters=1">the 12 French-language comedy films in DVD format</a>:<div style="border: 2px solid gray; font-size 90%; padding: .25em;"><img src="http://cdn.dltj.org/wp-content/uploads/2007/06/french_dvd_comedy_films1.png" alt="Miami Video Catalog browse for French comedies in DVD" title="Miami Video Catalog browse for French comedies in DVD" width="603" height="71" border="0" style="border-bottom: 1px dashed gray;" />Example from Miami University&#8217;s video catalog facets after browsing for French-language Comedies on DVD.</div><p> From this screen, to see what is available in all media formats one need just click the red &#8216;X&#8217; to the right of &#8220;DVD&#8221;.  Also note the &#8220;RSS Feed&#8221; symbol on the right side of this interface snapshot.  
The results of any search/browse are immediately available as an RSS feed &#8212; a very convenient way to receive notifications of new titles that match this search!</p><p>Congratulations to Rob and everyone else at Miami who brought this interface into existence.  It is a nice model and something we can all learn from through your experiences.  Please keep us updated as the project continues.<p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://foam.lib.muohio.edu/blog/?p=13 on February 11th, 2011.</p><p style="padding:0;margin:0;font-style:italic;" class="removed_link">The text was modified to remove a link to http://foam.lib.muohio.edu/blog/?p=14 on February 11th, 2011.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/miami-video-solr/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Killing Off Runaway Apache Processes</title><link>http://dltj.org/article/die-apache-die/</link> <comments>http://dltj.org/article/die-apache-die/#comments</comments> <pubDate>Mon, 26 Feb 2007 22:03:10 +0000</pubDate> <dc:creator>Peter Murray</dc:creator> <category><![CDATA[Meta Category]]></category> <category><![CDATA[Raw Technology]]></category> <category><![CDATA[apache]]></category> <category><![CDATA[Gentoo]]></category> <category><![CDATA[system administration]]></category> <category><![CDATA[WordPress]]></category><guid isPermaLink="false">http://dltj.org/2007/02/die-apache-die/</guid> <description><![CDATA[Well, something is still going wrong on dltj.org &#8212; despite previous performance tuning efforts, I&#8217;m still running into cases where machine performance grinds to a halt. 
In debugging it a bit further, I&#8217;ve found that the root cause is an &#8230; <a href="http://dltj.org/article/die-apache-die/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<abbr class="unapi-id ignore noPrint" title="http://dltj.org/2007/02/die-apache-die/"></abbr><p>Well, something is still going wrong on <i>dltj.org</i> &mdash; despite <a href="http://dltj.org/2007/02/wordpress-mysql-tuning/">previous performance tuning efforts</a>, I&#8217;m still running into cases where machine performance grinds to a halt.  In debugging it a bit further, I&#8217;ve found that the root cause is an apache httpd process which wants to consume nearly all of real memory which then causes the rest of the machine to <a href="http://en.wikipedia.org/wiki/Thrash_%28computer_science%29" title="Wikipedia: Thrash">thrash</a> horribly.  The problem is that I haven&#8217;t figured out what is causing that one thread to want to consume so much RAM &mdash; nothing unusual appears in either the access or the error logs and I haven&#8217;t figured out a way to debug a running apache thread.  (Suggestions anyone?)</p><div style="border: 1px solid black; color black; background: #EEE"><strong>Found it!</strong> It was a WordPress plug-in plus a change to the PHP configuration that was causing the problem.  The fix for the fundamental cause of the problem came from a comment timestamped February 8th, 2007 at 3:55 pm on the <a href="http://www.elvery.net/drzax/2006/02/10/footnotes-0-9-plugin-for-wordpress-2-0-x/" title="http://www.elvery.net/drzax/2006/02/10/footnotes-0-9-plugin-for-wordpress-2-0-x/">Footnotes 0.9 Plugin for WordPress 2.0.x</a> page.  
An infinite loop was consuming both CPU cycles and RAM, and this was exacerbated by a change I made to the maximum CPU execution time for PHP scripts that was required in order to play with the <a href="http://blog.vimagic.de/ip-city-cluster-wordpress-plugin/" title="WordPress &amp;rsaquo; Error">IP City Cluster plug-in</a>.  With the patch to the Footnotes plug-in, <i>dltj.org</i> has gone 12 hours without a run-away apache process.</div><p>In any case, I whipped up this little ditty that is running every five minutes in cron as a way to gloss over the problem for the moment.  Running as root, it looks into all of the processes in the <a href="http://en.wikipedia.org/wiki/Procfs" title="Wikipedia: procfs">virtual /proc file system</a>, specifically in the &#8216;stat&#8217; file, and using <a href="http://en.wikipedia.org/wiki/AWK_%28programming_language%29" title="Wikipedia: AWK">awk</a> looks to see if the second space-delimited value is the name of the httpd process (this is the <a href="http://www.gentoo.org/" title="Gentoo Linux -- Gentoo Linux News">Gentoo Linux</a> distribution, so the name of the process is <tt>apache2</tt>) and the 23rd space-delimited value (the virtual size of the process, in bytes) is bigger than 800 MB (800,000,000 bytes).  If so, it prints out the PID of the process (the first value in the <tt>stat</tt> file), at which point the bash script unceremoniously sends it a <tt>kill</tt> (&#8216;-9&#8217;) signal.  The script looks like this:</p><div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/bash</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">for</span> i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #000000; font-weight: bold;">`/</span>bin<span style="color: #000000; font-weight: bold;">/</span><span style="color: #c20cb9; font-weight: bold;">ls</span> <span style="color: #660033;">-d</span> <span style="color: #000000; font-weight: bold;">/</span>proc<span style="color: #000000; font-weight: bold;">/</span><span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #000000;">0</span>-<span style="color: #000000;">9</span><span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #000000; font-weight: bold;">*`</span>; <span style="color: #000000; font-weight: bold;">do</span>
        <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #660033;">-f</span> <span style="color: #007800;">$i</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #c20cb9; font-weight: bold;">stat</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
                <span style="color: #007800;">pid</span>=<span style="color: #000000; font-weight: bold;">`/</span>bin<span style="color: #000000; font-weight: bold;">/</span><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{ if ($2 == &quot;(apache2)&quot; &amp;&amp; $23 &gt; 800000000) print $1}'</span> <span style="color: #007800;">$i</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #c20cb9; font-weight: bold;">stat</span><span style="color: #000000; font-weight: bold;">`</span>
                <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$pid</span>&quot;</span> <span style="color: #000000; font-weight: bold;">!</span>= <span style="color: #ff0000;">&quot;&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
                        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;Killing <span style="color: #007800;">$pid</span> because of load average: <span style="color: #780078;">`awk '{print $1}' /proc/loadavg`</span>&quot;</span>
                        <span style="color: #c20cb9; font-weight: bold;">kill</span> <span style="color: #660033;">-9</span> <span style="color: #007800;">$pid</span>
                <span style="color: #000000; font-weight: bold;">fi</span>
        <span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #000000; font-weight: bold;">done</span></pre></div></div><p>If anyone has any suggestions as to how to narrow down what the problem might be, I&#8217;d appreciate hearing from you.  I&#8217;ve tried eliminating WordPress plugins, recompiling WordPress and Apache, and attempted to catch the behavior with a network traffic sniffer, but have come up empty so far.</p>]]></content:encoded> <wfw:commentRss>http://dltj.org/article/die-apache-die/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> </channel> </rss>
