java.lang.Object
  com.sun.portal.providers.ProviderAdapter
    com.sun.portal.providers.ProfileProviderAdapter
      com.sun.portal.providers.urlscraper.URLScraperProvider
A URLScraperProvider is a content provider that can retrieve and display content from a given URL.
URLScraperProvider acts as an HTTP client and makes a request for the content of the specified URL and then displays it in the channel.
Each URLScraper channel has its own timeout attribute. The channel will wait up to its individual timeout to receive content.
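The per-channel wait described above can be pictured as a bounded wait on a fetch task. The following is a minimal sketch, not the provider's actual implementation: the class ChannelTimeoutSketch, the method fetchWithTimeout, the fallback string and the use of milliseconds are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; not part of the URLScraperProvider API. */
final class ChannelTimeoutSketch {

    /**
     * Runs the fetch task and waits at most timeoutMillis for its result.
     * Returns the fallback text if the task is too slow or fails.
     */
    static String fetchWithTimeout(Callable<String> fetch, long timeoutMillis, String fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(fetch);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the slow fetch
                return fallback;
            } catch (ExecutionException e) {
                return fallback; // the fetch itself failed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return fallback;
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Each channel would pass its own timeout value, so one slow URL does not delay the others beyond their individual limits.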
Forwarding of cookies
Each URLScraper channel has a cookiesToForwardList attribute that can be set in the display profile. If a cookie is allowed by this attribute, that cookie in the request coming from the browser is forwarded to the web server specified by the URL. The allCookies attribute can be set to true to forward all cookies. A set-cookie request from that web server is sent back to the browser, modified so that the cookie is only sent back to the portal server.
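The allow-list behavior can be sketched as a simple filter over the browser's cookies. This is an illustration only: CookieForwardingSketch and cookieHeader are hypothetical names, and the assumption that cookiesToForwardList holds exact cookie names is not confirmed by this page.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper; not part of the URLScraperProvider API. */
final class CookieForwardingSketch {

    /**
     * Builds the Cookie header value to send to the scraped web server.
     * A cookie is forwarded if allCookies is true or its name is on the list.
     */
    static String cookieHeader(Map<String, String> browserCookies,
                               List<String> cookiesToForwardList,
                               boolean allCookies) {
        StringJoiner header = new StringJoiner("; ");
        for (Map.Entry<String, String> cookie : browserCookies.entrySet()) {
            if (allCookies || cookiesToForwardList.contains(cookie.getKey())) {
                header.add(cookie.getKey() + "=" + cookie.getValue());
            }
        }
        return header.toString();
    }
}
```

With an empty list and allCookies set to false, the sketch forwards nothing, which mirrors the idea that forwarding is opt-in per cookie.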
URL Rewriting
The content gathered by the channel will be rewritten if
the rewriter is available. The ruleset used by the rewriter can be
specified in the display profile attribute rulesetID.
Relative URLs are converted to absolute URLs. For example, if your portal server is
http://portal.iplanet.com/
and the web server specified in the
URL is http://foo.sesta.com/
and the file contains
<IMG SRC="/images/blah.gif">
then the content sent back to browser via portal server will be
rewritten as:
<IMG SRC="http://foo.sesta.com/images/blah.gif">
Otherwise, the browser would attempt to read the image from
http://portal.iplanet.com/images/blah.gif
and would fail to resolve it.
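The relative-to-absolute conversion in the example above can be reproduced with the standard java.net.URI class. This sketch shows only the URL resolution step; the portal rewriter itself is driven by a ruleset and is not modeled here. RelativeUrlSketch and toAbsolute are hypothetical names.

```java
import java.net.URI;

/** Hypothetical helper; shows URL resolution only, not the rewriter. */
final class RelativeUrlSketch {

    /**
     * Resolves a reference found in scraped content against the URL
     * of the page it was scraped from.
     */
    static String toAbsolute(String pageUrl, String reference) {
        return URI.create(pageUrl).resolve(reference).toString();
    }
}
```

For instance, toAbsolute("http://foo.sesta.com/", "/images/blah.gif") yields the rewritten SRC value shown in the example.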
SSL protected pages
In general the URLScraperProvider will work with SSL pages. The
important thing to remember is that there can be no level of
interaction required by the specified URL as there is no way to
pass that information to the end user.
Timeouts
There are 2 timeout values to consider:
Encoding
The encoding is determined in the following order:
1. HTTP header, if available (only applies to http(s) URLs)
2. inputEncoding property, if non-blank
3. Tag in the content, if available, e.g. the meta tag in HTML and WML, or the XML header in XML (only applies to HTML, XML and WML, as determined by the MIMEType)
4. System default
MIMEType is determined from the jvm table. If not set, it is determined
from the file extension.
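The lookup order for the encoding amounts to a first-non-blank fallback chain. The sketch below assumes the three candidate values have already been extracted by the caller; EncodingOrderSketch and resolveEncoding are hypothetical names, and the MIMEType check that gates the content-tag step is left out.

```java
import java.nio.charset.Charset;

/** Hypothetical helper; not part of the URLScraperProvider API. */
final class EncodingOrderSketch {

    /**
     * Returns the first non-blank candidate in the documented order:
     * HTTP header, inputEncoding property, tag in content, system default.
     */
    static String resolveEncoding(String httpHeaderCharset,
                                  String inputEncoding,
                                  String contentTagCharset) {
        String[] candidates = { httpHeaderCharset, inputEncoding, contentTagCharset };
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        // Nothing was specified, so fall back to the JVM default charset.
        return Charset.defaultCharset().name();
    }
}
```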
Proxy Configuration
The URLScraper channel uses a proxy to scrape the specified URL if a proxy is set in the jvm12.conf file for the web server. For example, the proxy can be set as:
http.proxyHost=
http.proxyPort=
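http.proxyHost and http.proxyPort are the standard JVM networking properties, so a client can turn them into a java.net.Proxy as sketched below. How jvm12.conf passes these values to the JVM is not shown on this page and is assumed here; ProxySettingsSketch and fromProperties are hypothetical names.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.Properties;

/** Hypothetical helper; not part of the URLScraperProvider API. */
final class ProxySettingsSketch {

    /**
     * Builds an HTTP proxy from http.proxyHost / http.proxyPort,
     * or returns Proxy.NO_PROXY when no proxy host is configured.
     */
    static Proxy fromProperties(Properties props) {
        String host = props.getProperty("http.proxyHost");
        if (host == null || host.trim().isEmpty()) {
            return Proxy.NO_PROXY;
        }
        // 80 is the JVM's default when http.proxyPort is not set.
        int port = Integer.parseInt(props.getProperty("http.proxyPort", "80").trim());
        return new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(host.trim(), port));
    }
}
```

A caller would typically pass System.getProperties() to pick up the values the web server's JVM was started with.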
The refreshTime attribute is used for caching and will cause the URL not to be fetched again if the page is reloaded within that time.
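The effect of refreshTime can be illustrated with a small time-based cache. This is a sketch under assumptions: RefreshTimeCacheSketch is a hypothetical class, the interval is expressed in milliseconds for convenience (the unit of refreshTime is not stated on this page), and the clock is passed in explicitly to keep the example deterministic.

```java
import java.util.function.Supplier;

/** Hypothetical helper; not part of the URLScraperProvider API. */
final class RefreshTimeCacheSketch {
    private final long refreshMillis;
    private String cached;
    private long fetchedAt;
    private boolean hasValue;

    RefreshTimeCacheSketch(long refreshMillis) {
        this.refreshMillis = refreshMillis;
    }

    /**
     * Returns the cached content if it is younger than the refresh
     * interval; otherwise calls fetch and caches the new content.
     */
    synchronized String get(long nowMillis, Supplier<String> fetch) {
        if (!hasValue || nowMillis - fetchedAt >= refreshMillis) {
            cached = fetch.get();
            fetchedAt = nowMillis;
            hasValue = true;
        }
        return cached;
    }
}
```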
Field Summary | |
protected static String[][] |
typeTable
Array of File extensions mapped to the MIMETypes |
Fields inherited from interface com.sun.portal.providers.util.ProviderProperties |
ACTIVE_BULLET_IMAGE, ARRANGE_PROVIDER_JS, ATTACH_IMAGE, BANNER, BANNER_TEMPLATE, BANNER_TEMPLATE_NOCONTEXT, BARE_PROVIDER_WRAPPER_TEMPLATE, BG_COLOR, BGCOLOR, BORDER_COLOR, BORDER_SIZE, BORDER_WIDTH, BORDERLESS_CHANNELS, BRAND_BG_COLOR, BRAND_IMAGE, BRAND_IMAGE_BG_COLOR, BRAND_IMAGE_WIDTH, BRAND_IMAGE2, BRAND_IMAGE2_BG_COLOR, BULLET_COLOR, BULLET_COLOR_JS, CHANNEL_HIGHLIGHT_COLOR, CHANNEL_LINK_COLOR, CHANNELS_BACKGROUND_COLOR, CHANNELS_COLUMN, CHANNELS_HAS_FRAME, CHANNELS_IS_DETACHABLE, CHANNELS_IS_DETACHED, CHANNELS_IS_MAXIMIZABLE, CHANNELS_IS_MINIMIZABLE, CHANNELS_IS_MINIMIZED, CHANNELS_IS_MOVABLE, CHANNELS_IS_REMOVABLE, CHANNELS_ROW, CHANNELS_WIDTH, CONSUME_EVENT_LIST, CONTENT, CONTENT_BAR_IN_CONTENT, CONTENT_BAR_IN_CONTENT_TEMPLATE, CONTENT_BAR_IN_LAYOUT, CONTENT_BAR_IN_LAYOUT_TEMPLATE, CONTENT_LAYOUT, CONTENT_LAYOUT_LINK_COLOR, CONTENT_LAYOUT_TEMPLATE, CONTENT_LAYOUT_TEXT, CONTENT_TEMPLATE, DEFAULT_BORDERLESS_CHANNEL, DEFAULT_CHANNEL_COLUMN, DEFAULT_CHANNEL_HAS_FRAME, DEFAULT_CHANNEL_IS_DETACHABLE, DEFAULT_CHANNEL_IS_DETACHED, DEFAULT_CHANNEL_IS_MAXIMIZABLE, DEFAULT_CHANNEL_IS_MINIMIZABLE, DEFAULT_CHANNEL_IS_MINIMIZED, DEFAULT_CHANNEL_IS_MOVABLE, DEFAULT_CHANNEL_IS_REMOVABLE, DEFAULT_CHANNEL_ROW, DEFAULT_CHANNEL_WIDTH, DESKTOP_URL, DETACH_IMAGE, EDIT_CONTAINER_NAME, EDIT_IMAGE, EDIT_PROVIDER_TEMPLATE, EDIT_TEMPLATE, EMPTY_PROVIDER_CONTENT, ERR_MESSAGE, ERROR_TEMPLATE, ERROR_TEMPLATE_NOCONTEXT, EVENT_PORTLET_MAP, FONT_COLOR, FONT_FACE, FONT_FACE1, FONT_SIZE, FRONT_CONTAINER_NAME, FULLWIDTH_POPUP_HEIGHT, FULLWIDTH_POPUP_WIDTH, GENERATE_EVENT_LIST, GLOBAL_PORTLET_LIST, HAS_FRAME, HEADER_BG_COLOR, HEADER_FONT_COLOR, HEADER_TEXT, HELP_ICON, HELP_IMAGE, HELP_LINK, HELP_TAG, HELP_URL, HELP_URLS, INACTIVE_BULLET_IMAGE, INLINE_ERROR, INLINE_ERROR_TEMPLATE, LAST_CHANNEL_NAME, LAUNCH_POPUP, LAUNCH_POPUP_JS, LAYOUT, LAYOUT_FULL_BOTTOM_TEMPLATE, LAYOUT_FULL_TOP_TEMPLATE, LAYOUT1_TEMPLATE, LAYOUT2_TEMPLATE, LAYOUT3_TEMPLATE, LAYOUT4_TEMPLATE, LINK_SEPARATOR_COLOR, 
LOCALE_STRING, LOGOUT_URL, MAXIMIZE_IMAGE, MAXIMIZED_CHANNEL, MAXIMIZED_TEMPLATE, MENUBAR, MENUBAR_TEMPLATE, MINIMIZE_IMAGE, MINIMIZED_TEMPLATE, NORMALIZE_IMAGE, OPENURL_INPARENT_JS, OPTIONS_TEMPLATE, OVERLOAD_TEMPLATE, PARALLEL_CHANNELS_INIT, PARENT_CONTAINER_NAME, PARENT_TAB_CONTAINER, PERFORM_COLUMN_SUBSTITUTION_JS, PERFORM_SUBSTITUTION_JS, POPUP_MENUBAR_TEMPLATE, POPUP_TEMPLATE, PRODUCT_NAME, PROVIDER_CMDS, PROVIDER_NAME, PROVIDER_TITLE, PROVIDER_WRAPPER_TEMPLATE, REFRESH_PARENT_CONTAINER_ONLY, REMOVE_IMAGE, REMOVE_PROVIDER_JS, S_ATTACH_IMAGE, S_BRAND_IMAGE, S_BRAND_IMAGE2, S_DETACH_IMAGE, S_EDIT_IMAGE, S_HELP_IMAGE, S_MAXIMIZE_IMAGE, S_MINIMIZE_IMAGE, S_NORMALIZE_IMAGE, S_REMOVE_IMAGE, SELECT_ALL_JS, SELECTED_TAB_NAME, SIZE, STACK_TRACE, STATIC_CONTENT, SWITCH_COLUMNS_JS, TAB_COLOR, TAB_FONT_COLOR, TAB_NOTCH_IMAGE, TAB_PORTLET_LIST, TABLE_BG_COLOR, THEME_CHANNEL, THICK_POPUP_HEIGHT, THICK_POPUP_WIDTH, THIN_POPUP_HEIGHT, THIN_POPUP_WIDTH, TIMEOUT, TITLE, TITLE_BAR_COLOR, TITLE_FONT_COLOR, TITLE_TEXT, TOOLBAR_ROLLOVER, TOOLBAR_ROLLOVER_JS, USER_TEMPLATE |
Fields inherited from interface com.sun.portal.providers.ProviderWidths |
WIDTH_FULL_BOTTOM, WIDTH_FULL_TOP, WIDTH_THICK, WIDTH_THIN |
Fields inherited from interface com.sun.portal.providers.ProviderEditTypes |
EDIT_COMPLETE, EDIT_SUBSET |
Constructor Summary | |
URLScraperProvider()
Default constructor. |
Method Summary | |
StringBuffer |
getContent(javax.servlet.http.HttpServletRequest req,
javax.servlet.http.HttpServletResponse res)
Get the provider's content by retrieving content from specified URL. |
protected boolean |
getCookiesToForwardAll()
|
protected List |
getcookiesToForwardList()
|
StringBuffer |
getEdit(javax.servlet.http.HttpServletRequest req,
javax.servlet.http.HttpServletResponse res)
Calls the getEdit(Map) method in this object to provide backwards compatibility. |
protected File |
getFile(String pathname)
This method is called by getContent() if the url
returned by getURL() is a file url. |
protected StringBuffer |
getFileAsBuffer(String pathName)
Gets the specified file as StringBuffer |
protected String |
getFormData()
|
protected String |
getHttpAuthPassword()
|
protected String |
getHttpAuthUid()
|
protected StringBuffer |
getHttpContent(javax.servlet.http.HttpServletRequest req,
javax.servlet.http.HttpServletResponse res,
String url)
Get the provider's content by retrieving content from the specified http or https URL. |
protected StringBuffer |
getHttpContent(javax.servlet.http.HttpServletRequest req,
javax.servlet.http.HttpServletResponse res,
String url,
boolean ubt)
Get the provider's content by retrieving content from the specified http or https URL. |
String |
getInputEncoding()
Gets the inputEncoding to be used by content. |
protected String |
getLoginFormData()
|
protected String |
getLoginUrl()
|
protected String |
getLogoutUrl()
|
protected String |
getRuleSetID()
Gets the urlScraperRulesetID to be used by rewriter. |
protected int |
getTimeout()
Gets the timeout property for the provider. |
protected String |
getURL()
Gets the url property for the provider. |
protected boolean |
isHttpAuth()
|
boolean |
isPresentable(javax.servlet.http.HttpServletRequest request)
Determines presentability for channels based on this provider. |
URL |
processEdit(javax.servlet.http.HttpServletRequest req,
javax.servlet.http.HttpServletResponse res)
Calls the processEdit(Map) method in this object to provide backwards compatibility. |
Methods inherited from class com.sun.portal.providers.ProviderAdapter |
getContent, getDescription, getEdit, getEditType, getHelp, getHelp, getName, getProviderContext, getRefreshTime, getResourceBundle, getResourceBundle, getTitle, getWidth, init, isEditable, isPresentable, processEdit |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
protected static String[][] typeTable
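A two-dimensional String array of this kind is typically searched row by row. The sketch below assumes each row holds a file extension followed by its MIME type; the entries shown are illustrative and are not the actual contents of typeTable, and TypeTableSketch and mimeTypeFor are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the URLScraperProvider API. */
final class TypeTableSketch {

    // Illustrative entries only; assumed layout is { extension, MIME type }.
    static final String[][] TYPE_TABLE = {
        { "html", "text/html" },
        { "htm", "text/html" },
        { "xml", "text/xml" },
        { "wml", "text/vnd.wap.wml" },
        { "txt", "text/plain" },
    };

    /** Returns the MIME type for the file's extension, or null if unknown. */
    static String mimeTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension present
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        for (String[] row : TYPE_TABLE) {
            if (row[0].equals(extension)) {
                return row[1];
            }
        }
        return null;
    }
}
```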
Constructor Detail |
public URLScraperProvider()
Method Detail |
protected int getTimeout() throws ProviderException
Gets the timeout property for the provider.
Throws:
ProviderException - if there is an error getting the timeout property.
See Also:
ProviderException
protected String getURL() throws ProviderException
Gets the url property for the provider. This is the URL from which the content is fetched.
Throws:
ProviderException - if there is an error getting the URL property.
See Also:
ProviderException
protected String getRuleSetID() throws ProviderException
Gets the urlScraperRulesetID to be used by the rewriter.
Throws:
ProviderException - if there is an error getting the urlScraperRulesetID.
See Also:
ProviderException
public String getInputEncoding() throws ProviderException
Gets the inputEncoding to be used by the content. This method returns the inputEncoding used for encoding the scraped content.
Throws:
ProviderException - if there is an error getting the input encoding.
See Also:
ProviderException
public boolean isPresentable(javax.servlet.http.HttpServletRequest request)
Determines presentability for channels based on this provider.
Specified by:
isPresentable in interface Provider
Overrides:
isPresentable in class ProviderAdapter
Parameters:
request - the HttpServletRequest
See Also:
Provider.isPresentable(javax.servlet.http.HttpServletRequest)
public StringBuffer getContent(javax.servlet.http.HttpServletRequest req, javax.servlet.http.HttpServletResponse res) throws ProviderException
Get the provider's content by retrieving content from the specified URL.
This method internally calls getHttpContent when the URL returned from getURL() is an http or https URL.
This method wraps certain thrown exceptions into an error message that is displayed as the channel content.
Specified by:
getContent in interface Provider
Overrides:
getContent in class ProviderAdapter
Parameters:
req - An HttpServletRequest that contains information related to this request for content.
res - An HttpServletResponse that allows the provider to influence the overall response for the desktop page (besides generating the content).
Throws:
ProviderException - if there was an error generating the content.
See Also:
ProviderException, getHttpContent(javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletResponse, java.lang.String), getURL()
protected StringBuffer getHttpContent(javax.servlet.http.HttpServletRequest req, javax.servlet.http.HttpServletResponse res, String url) throws InterruptedException, MalformedURLException, ProviderException
Get the provider's content by retrieving content from the specified http or https URL.
This method does not handle file URLs. It only handles http or https URLs.
The content scraped from the specified URL is rewritten, if a rewriter is available, using the ruleset returned by getRuleSetID().
This method throws exceptions for certain exceptional conditions instead of returning an error message in the returned StringBuffer.
Parameters:
req - An HttpServletRequest that contains information related to this request for content.
res - An HttpServletResponse that allows the provider to influence the overall response for the desktop page (besides generating the content).
url - http or https URL string
Throws:
InterruptedException - if there is a timeout while trying to get the scraped content.
MalformedURLException - if the URL passed in is not a valid http or https URL.
ProviderException - if there was an error generating the content.
See Also:
ProviderException, getRuleSetID()
protected StringBuffer getHttpContent(javax.servlet.http.HttpServletRequest req, javax.servlet.http.HttpServletResponse res, String url, boolean ubt) throws InterruptedException, MalformedURLException, ProviderException
Get the provider's content by retrieving content from the specified http or https URL.
This method does not handle file URLs. It only handles http or https URLs.
The content scraped from the specified URL is rewritten, if a rewriter is available, using the ruleset returned by getRuleSetID().
This method throws exceptions for certain exceptional conditions instead of returning an error message in the returned StringBuffer.
Parameters:
req - An HttpServletRequest that contains information related to this request for content.
res - An HttpServletResponse that allows the provider to influence the overall response for the desktop page (besides generating the content).
url - http or https URL string
ubt - Indicates whether to track links external to the portal
Throws:
InterruptedException - if there is a timeout while trying to get the scraped content.
MalformedURLException - if the URL passed in is not a valid http or https URL.
ProviderException - if there was an error generating the content.
See Also:
ProviderException, getRuleSetID()
protected File getFile(String pathname)
This method is called by getContent() if the URL returned by getURL() is a file URL.
protected StringBuffer getFileAsBuffer(String pathName) throws IOException, ProviderException
Gets the specified file as a StringBuffer.
Throws:
IOException
ProviderException - if there is an error getting the file as StringBuffer.
See Also:
ProviderException
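Reading a file into a StringBuffer can be sketched with the standard java.nio.file API. This is not the provider's implementation: FileAsBufferSketch and readFile are hypothetical names, and passing the character encoding explicitly is an assumption made so the example does not depend on the platform default.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper; not part of the URLScraperProvider API. */
final class FileAsBufferSketch {

    /** Reads the whole file and decodes it with the given encoding. */
    static StringBuffer readFile(String pathName, Charset encoding) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(pathName));
        return new StringBuffer(new String(bytes, encoding));
    }
}
```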
protected boolean getCookiesToForwardAll() throws ProviderException
Throws:
ProviderException
protected List getcookiesToForwardList() throws ProviderException
Throws:
ProviderException
protected boolean isHttpAuth() throws ProviderException
Throws:
ProviderException
protected String getHttpAuthUid() throws ProviderException
Throws:
ProviderException
protected String getHttpAuthPassword() throws ProviderException
Throws:
ProviderException
protected String getLoginUrl() throws ProviderException
Throws:
ProviderException
protected String getLogoutUrl() throws ProviderException
Throws:
ProviderException
protected String getLoginFormData() throws ProviderException
Throws:
ProviderException
protected String getFormData() throws ProviderException
Throws:
ProviderException
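public StringBuffer getEdit(javax.servlet.http.HttpServletRequest req, javax.servlet.http.HttpServletResponse res) throws ProviderException
Description copied from class: ProviderAdapter
Calls the getEdit(Map) method in this object to provide backwards compatibility.
Specified by:
getEdit in interface Provider
Overrides:
getEdit in class ProviderAdapter
Throws:
ProviderException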
public URL processEdit(javax.servlet.http.HttpServletRequest req, javax.servlet.http.HttpServletResponse res) throws ProviderException
Description copied from class: ProviderAdapter
Calls the processEdit(Map) method in this object to provide backwards compatibility.
The implementation of this method provides backwards compatibility for providers that only implement the deprecated processEdit(Map) method. It logs a warning informing the administrator that calling this method has performance implications, and that it should be re-implemented using the non-deprecated version of this method.
Each time this method is called, the HTTP parameter data in the request object must be converted to the Map form that is accepted by the processEdit(Map) version of this method.
Specified by:
processEdit in interface Provider
Overrides:
processEdit in class ProviderAdapter
Throws:
ProviderException
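The per-call conversion of HTTP parameter data into a Map can be pictured roughly as follows. The exact Map form that processEdit(Map) accepts is not described on this page, so this sketch assumes a name-to-first-value mapping; ParameterMapSketch and firstValues are hypothetical names. The input has the same shape as the map returned by HttpServletRequest.getParameterMap().

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of the ProviderAdapter API. */
final class ParameterMapSketch {

    /**
     * Flattens request parameters (name to value array) into a
     * name to first-value map. Parameters without values are skipped.
     */
    static Map<String, String> firstValues(Map<String, String[]> parameters) {
        Map<String, String> flat = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> entry : parameters.entrySet()) {
            String[] values = entry.getValue();
            if (values != null && values.length > 0) {
                flat.put(entry.getKey(), values[0]);
            }
        }
        return flat;
    }
}
```

Because a copy like this is built on every call, it is easy to see why the deprecated path carries a performance cost.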